Lecture 18: 0xCAFEBABE (Java Byte Codes) CS201j: Engineering Software University of Virginia Computer Science David Evans http://www.cs.virginia.edu/evans Menu • Running Programs – Crash Course in Architecture (CS333) – Crash Course in Compilers (CS571) • Java Virtual Machine • Byte Codes 4 November 2003 CS 201J Fall 2003 2 Computer Architecture Processor does computation Memory stores bits Input Devices (mouse, keyboard, accelerometer) get input from user and environment Output Devices (display, speakers) present output to user 4 November 2003 CS 201J Fall 2003 3 Central Processing Unit (CPU) 4 November 2003 CS 201J Fall 2003 4 Intel 4004 • First general purpose microprocessor, 1971 • 4-bit data • 46 instructions – 8-bit instructions! 4 November 2003 CS 201J Fall 2003 5 PC Motherboard Memory CPU From http://www.cyberiapc.com/hardwarebeg.htm 4 November 2003 CS 201J Fall 2003 6 Inside the CPU • Registers • Loads and decodes instructions from memory • ALU: Arithmetic Logic Unit – Does arithmetic – Can only operate on values in registers – Must load values from memory into registers before computing with them 4 November 2003 CS 201J Fall 2003 7 Compiler • Translates a program in a high-level language into machine instructions • Calling convention – How are parameters passed to functions – How is the stack managed to return • Register allocation – Figure out how to use registers efficiently 4 November 2003 CS 201J Fall 2003 8 6: int max (int a, int b) { push ebp push instruction is 1 byte 00401010 00401011 mov ebp,esp mov instruction is 2 bytes 00401013 Dealing with sub esp,40h function call: 00401016 push ebx updating 00401017 push esi stack, 00401018 push edi moving 00401019 lea edi,[ebp-40h] In Visual C++, see arguments 0040101C mov ecx,10h assembly 00401021 mov eax,0CCCCCCCCh code by running Debug, 00401026 rep stos dword ptr [edi] then 7: if (a > b) { Window | Disassembly 00401028 mov eax,dword ptr [ebp+8] 0040102B cmp eax,dword ptr [ebp+0Ch] int max (int a, int b) { 0040102E jle max+25h (00401035) if (a > b) { 8: return b; return b; 00401030 mov eax,dword ptr [ebp+0Ch] } else { 00401033 jmp max+28h (00401038) return a; 9: } else { } 10: return a; } Cleanup and return 4 November 2003 CS 201J Fall 2003 9 00401035 00401038 00401039 0040103A 0040103B 0040103D 0040103E mov pop pop pop mov pop ret eax,dword ptr [ebp+8] edi esi ebx esp,ebp ebp Java Virtual Machine 4 November 2003 CS 201J Fall 2003 10 Java Ring (1998) 4 November 2003 CS 201J Fall 2003 11 Java Card 4 November 2003 CS 201J Fall 2003 12 Java Virtual Machine • Small and simple to implement • All VMs will run all programs the same way • Secure 4 November 2003 CS 201J Fall 2003 13 Implementing the JavaVM load class into memory set the instruction pointer to point to the beginning of main do { fetch the next instruction execute that instruction } while (there is more to do); Some other issues we will talk about Thursday and next week: Verification – need to check byte codes satisfy security policy Garbage collection – need to reclaim unused storage 4 November 2003 CS 201J Fall 2003 14 Java Byte Codes • Stack-based virtual machine • Small instruction set: 202 instructions (all are 1 byte opcode + operands) – Intel x86: ~280 instructions (1 to 17 bytes long!) • Memory is typed • Every Java class file begins with magic number 3405691582 = 0xCAFEBABE in base 16 4 November 2003 CS 201J Fall 2003 15 Stack-Based Computation • push – put something on the top of the stack • pop – get and remove the top of the stack Stack push 2 push 3 add 2 5 3 Does 2 pops, pushes sum 4 November 2003 CS 201J Fall 2003 16 Some Java Instructions Opcode Mnemonic Description 0 nop Does nothing 1 aconst_null Push null on the stack 3 iconst_0 Push int 0 on the stack 4 iconst_1 Push int 1 on the stack … 4 November 2003 CS 201J Fall 2003 17 Some Java Instructions Opcode 18 Mnemonic ldc <value> Description Push a one-word (4 bytes) constant onto the stack Constant may be an int, float or String ldc “Hello” ldc 201 The String is really a reference to an entry in the string constant table! 4 November 2003 CS 201J Fall 2003 18 Arithmetic Opcode 96 Mnemonic iadd Description Pops two integers from the stack and pushes their sum iconst_2 iconst_3 iadd 4 November 2003 CS 201J Fall 2003 19 Arithmetic Opcode Mnemonic Description 96 iadd Pops two integers from the stack and pushes their sum 97 ladd Pops two long integers from the stack and pushes their sum … 106 fmul Pops two floats from the stack and pushes their product … 119 dneg Pops a double from the stack, and pushes its negation 4 November 2003 CS 201J Fall 2003 20 Java Byte Code Instructions • 0: nop • 1-20: putting constants on the stack • 96-119: arithmetic on ints, longs, floats, doubles • What other kinds of instructions do we need? 4 November 2003 CS 201J Fall 2003 21 Other Instruction Classes • Control Flow (~20 instructions) – if, goto, return • Method Calls (4 instructions) • Loading and Storing Variables (65 instructions) • Creating objects (1 instruction) • Using object fields (4 instructions) • Arrays (3 instructions) 4 November 2003 CS 201J Fall 2003 22 Control Flow • ifeq <label> Pop an int off the stack. If it is zero, jump to the label. Otherwise, continue normally. • if_icmple <label> Pop two ints off the stack. If the second one is <= the first one, jump to the label. Otherwise, continue normally. 4 November 2003 CS 201J Fall 2003 23 Method Calls • invokevirtual <method> – Invokes the method <method> on the parameters and object on the top of the stack. – Finds the appropriate method at run-time based on the actual type of the this object. invokevirtual <Method void println(java.lang.String)> 4 November 2003 CS 201J Fall 2003 24 Method Calls • invokestatic <method> – Invokes a static (class) method <method> on the parameters on the top of the stack. – Finds the appropriate method at run-time based on the actual type of the this object. 4 November 2003 CS 201J Fall 2003 25 Example public class Sample1 { static public void main (String args[]) { System.err.println ("Hello!"); System.exit (1); } } 4 November 2003 CS 201J Fall 2003 26 > javap -c Sample1 public class Sample1 { static public void main (String args[]) { System.err.println ("Hello!"); System.exit (1); } } Compiled from Sample1.java public class Sample1 extends java.lang.Object { public Sample1(); public static void main(java.lang.String[]); } Method Sample1() 0 aload_0 1 invokespecial #1 <Method java.lang.Object()> 4 return Method void main(java.lang.String[]) 0 getstatic #2 <Field java.io.PrintStream err> 3 ldc #3 <String "Hello!"> 5 invokevirtual #4 <Method void println(java.lang.String)> 8 iconst_1 9 invokestatic #5 <Method void exit(int)> 12 return 4 November 2003 CS 201J Fall 2003 27 Referencing Memory • iload <varnum> – Pushes the int in local variable <varnum> (1 bytes) on the stack • istore <varnum> – Pops the int on the top of the stack and stores it in local variable <varnum> 4 November 2003 CS 201J Fall 2003 28 Referencing Example Method void main(java.lang.String[]) 0 iconst_2 public class Locals1 { 1 istore_1 static public void main (String args[]) { 2 iconst_3 int a = 2; 3 istore_2 int b = 3; 4 iload_1 int c = a + b; 5 iload_2 6 iadd System.err.println ("c: " + c); } } 7 istore_3 8 getstatic #2 <Field java.io.PrintStream err> 11 new #3 <Class java.lang.StringBuffer> 14 dup 15 invokespecial #4 <Method java.lang.StringBuffer()> 18 ldc #5 <String "c: "> 20 invokevirtual #6 <Method java.lang.StringBuffer append(java.l 23 iload_3 24 invokevirtual #7 <Method java.lang.StringBuffer append(int)> 27 invokevirtual #8 <Method java.lang.String toString()> 30 invokevirtual #9 <Method void println(java.lang.String)> 33 return 4 November 2003 CS 201J Fall 2003 29 Charge • PS6 will involve reading and writing Java byte codes • Use javap –c <classname> to look at what the javac compiler produces for your code • Thursday: what would this program do? Method void main(java.lang.String[]) 0 iconst_2 1 iadd 2 return 4 November 2003 CS 201J Fall 2003 30