Compiler Construction Project

Prof. Dr. Hanspeter Mössenböck
University of Linz
A-4040 Linz
moessenboeck@ssw.uni-linz.ac.at

In this project you will write a small compiler for a Java-like language (MicroJava). You will learn how to apply the knowledge from the compiler construction module in practice and study all the details involved in a real compiler implementation.

The project consists of three levels:

Level 1 requires you to implement a scanner and a parser for the language MicroJava, specified in Appendix A of this document.

If you are more ambitious, you can also try to implement level 2, which deals with symbol table handling and type checking.

If you want to go to full length with your compiler, you should also implement level 3, which deals with code generation for the MicroJava Virtual Machine specified in Appendix B of this document. This level is (more or less) optional, so you can get a good mark even if you do not implement it.

The marking will be as follows:

    class test          up to 45 points
    project level 1        + 25 points
    project level 2        + 25 points
    project level 3        +  5 points
                           -----------
                            100 points

The project should be implemented in Java using Sun Microsystems' Java Development Kit (JDK, http://java.sun.com/j2se/1.5.0/index.jsp) or some other development environment.

Before you start with the implementation you should study the specification of the language MicroJava and the MicroJava Virtual Machine that executes the bytecodes (i.e. the machine program) generated from a MicroJava source program.

Level 1: Scanning and Parsing

In this part of the project you will implement a scanner and a recursive descent parser for MicroJava. Start with the implementation of the scanner and write a test program that repeatedly requests the next input token from the scanner. The test program should demonstrate that your scanner returns the correct tokens for a sample program (you can use the sample program in the MicroJava specification).
Next you should write a recursive descent parser that uses your scanner to read the input tokens. In the first step, your parser should be implemented without error handling. If it finds an error it should report it and just terminate. Write a test program that uses your parser to analyse the sample MicroJava program. In the second step you should augment your parser with error handling. Test your parser with various sample programs that contain syntax errors.

All classes of the compiler should belong to a package MJ.

Scanning

Study the specification of MicroJava carefully. What are the tokens of the MicroJava grammar? What is the format of names, numbers, character constants and comments? What keywords and predeclared names do you need? The scanner should be implemented as a file Scanner.java and should have the following interface:

    package MJ;
    import ...;

    public class Scanner {
        // token codes
        public static final int
            none    = 0,
            ident   = 1,
            number  = 2,
            charCon = 3,
            ...
            if_     = 33,   // "if" cannot be used as a name because it is a keyword
            new_    = 34,
            ...
            eof     = 41;

        // static variables
        private static Reader in;   // source file reader
        private static char ch;     // lookahead character
        public static int col;      // current column
        public static int line;     // current line

        private static void error(String msg) { ... }   // print error message
        public static void init(Reader r) { ... }       // initialize scanner
        public static Token next() { ... }              // return next input token
    }

The main method of the scanner is next() which is repeatedly called by the parser (or by your test program) and delivers the next input token on every call. The type Token should be implemented in a file Token.java and is specified as follows:

    package MJ;

    public class Token {
        public int kind;        // token kind
        public int line;        // token line
        public int col;         // token column
        public int val;         // token value (for number and charConst)
        public String string;   // token string
    }

Every call of next() returns the next input token.
If a symbol is not a valid token (e.g., & or $) next() should return the token value none. At the end of the input stream next() should return the token value eof. Your scanner should skip blanks, end of line characters, tabulator characters and comments.

The following situations represent lexical errors:
- The occurrence of an invalid character (e.g., $)
- Character constants with the following characteristics:
    - a missing quote at the end of the character constant ('x)
    - an empty character constant ('')
- Integer constants that are too large. The range of int is -2147483648..2147483647. Note that the scanner only recognizes non-negative constants (i.e., 0 to 2147483647). A negative number such as -3 is delivered as two distinct tokens for - and for 3.

In these situations next() should report an error and return a token with the value none.

At http://www.ssw.uni-linz.ac.at/Misc/CC/ you will find a fragment of the scanner that you can use as a starting point of your implementation.

In order to test your scanner, write a test program that repeatedly calls Scanner.next() and dumps the information returned by this method. For the program

    program Nonsense
      int dummy;
    {
      int Nonsense() { dummy=0; return dummy; }
      void main() { dummy= Nonsense(); }
    }

the test program should produce the following output:

    line 1, col 1:    program_
    line 1, col 7:    ident       Nonsense
    line 2, col 3:    ident       int
    line 2, col 7:    ident       dummy
    line 2, col 12:   semicolon
    line 3, col 1:    lbrace
    line 4, col 3:    ident       int
    line 4, col 7:    ident       Nonsense
    line 4, col 15:   lpar
    line 4, col 16:   rpar
    line 4, col 18:   lbrace
    line 4, col 20:   ident       dummy
    line 4, col 25:   assign
    line 4, col 26:   number      0
    ...

Apply your test program also to the sample program in the MicroJava specification and to a program containing lexical errors.

Parsing

The next step is to implement a recursive descent parser as a class Parser in the file Parser.java.
Every production of the MicroJava grammar should be implemented as a static method of class Parser.

    package MJ;

    public class Parser {
        private static Token t;    // most recently recognized token
        private static Token la;   // lookahead token (still unrecognized)
        private static int sym;    // most recent token number (always holds la.kind)

        public static void parse() {...}                // starts the syntax analysis
        private static void scan() {...}                // gets a token from the scanner and stores it in sym
        private static void check(int expected) {...}   // tries to recognize the token "expected"
        private static void error(String msg) {...}     // reports a syntax error

        private static void Program() {...}             // parses the production for Program
        ...
        private static void Mulop() {...}               // parses the production for Mulop
    }

The parser is started by calling the method parse(). It assumes that the scanner has already been initialized; it requests the first token from the scanner and calls the method for parsing the production of the start symbol of the MicroJava grammar (Program). The methods scan(), check() and error() are as discussed in the lecture.

Your first version of the parser should not try to recover from syntax errors. The method error() should simply report an error with its position (line and column) and a meaningful error message and should then terminate the program.

In addition to the parser you should also implement a main program (Compiler.java) that reads the name of the source file to be compiled, initializes the scanner and calls the parser.

    package MJ;

    public class Compiler {
        public static void main(String[] arg) {...}
    }

Test your parser using the sample program of the MicroJava specification. You should also test it with programs that contain syntax errors in order to see if the parser correctly reports the errors.

Syntax Error Handling

The final step of level 1 is to augment your parser with syntax error handling and recovery. Use the method of special anchors discussed in the lecture.
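Before recovery is added, the interplay of scan(), check() and error() in the first parser version might look like the following minimal sketch. It is not the lecture's implementation: the class name MiniParser, the token codes, the in-memory token list that stands in for the scanner, and the use of an exception instead of terminating the program are all assumptions made so that the example is self-contained. Only two productions from Appendix A (Type and VarDecl) are shown.

```java
import java.util.List;

// Hypothetical stand-alone sketch: tokens come from a list, not from MJ.Scanner.
class MiniParser {
    // Assumed token codes for this sketch only (the real ones live in Scanner).
    static final int none = 0, ident = 1, comma = 2, semicolon = 3,
                     lbrack = 4, rbrack = 5, eof = 6;

    private static List<Integer> input; // stands in for the scanner
    private static int pos;             // index of the next token to deliver
    private static int t = none;        // most recently recognized token kind
    private static int sym = none;      // lookahead token kind

    // Advance: the lookahead becomes the recognized token, then fetch a new lookahead.
    private static void scan() {
        t = sym;
        sym = pos < input.size() ? input.get(pos++) : eof;
    }

    // Recognize the expected token or report an error.
    private static void check(int expected) {
        if (sym == expected) scan();
        else error("token " + expected + " expected, found " + sym);
    }

    // First version: no recovery. The sketch throws instead of exiting
    // so that it can be exercised from a test.
    private static void error(String msg) {
        throw new IllegalStateException("syntax error: " + msg);
    }

    // Type = ident ["[" "]"].
    private static void Type() {
        check(ident);
        if (sym == lbrack) { scan(); check(rbrack); }
    }

    // VarDecl = Type ident {"," ident} ";".
    private static void VarDecl() {
        Type();
        check(ident);
        while (sym == comma) { scan(); check(ident); }
        check(semicolon);
    }

    // Parses exactly one VarDecl followed by the end of the input.
    static boolean parseVarDecl(List<Integer> tokens) {
        input = tokens;
        pos = 0;
        scan(); // load the first lookahead token
        try {
            VarDecl();
            check(eof);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

Each production becomes one method, and every decision is taken by looking only at sym. That property is what the error recovery with anchors builds on.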
In case of an error the parser should report it (calling error()) and continue parsing until it gets to the next synchronisation point where it should read and skip input tokens until it encounters a token that is a valid anchor. With this anchor the parser can continue. In order not to produce spurious error messages you should use the heuristics of a minimal error distance as discussed in the lecture.

The parser should also count the number of detected errors and make it accessible to the main program. At the end of the compilation the main program should print a message with the number of errors that were detected.

Test your error handler by inserting errors into a MicroJava program and by feeding it to the parser. Try to find out how many errors your parser can report in a single run.

Level 2: Symbol Table Handling and Type Checking

The parser can now check if a program is syntactically correct. In order to detect semantic errors, however, the compiler needs a symbol table in which it stores information about all declared names. This table is used for checking context conditions in the grammar.

Implement a symbol table as discussed in the lecture. It should use a class Obj to store information about declared names, a class Struct for type information, and a class Scope for maintaining nested scopes. The symbol table itself should be implemented as a class Tab with proper methods to insert and find names as well as to open and close scopes.

    package MJ.SymTab;

    public class Obj {
        public static final int   // object kinds
            Con = 0, Var = 1, Type = 2, Meth = 3, Prog = 4;
        public int kind;       // Con, Var, Type, Fld, Meth, Prog
        public String name;
        public Struct type;
        public Obj next;       // to the next Obj in this scope
        public int val;        // Con: constant value
        public int adr;        // Var, Meth: address
        public int level;      // Var: declaration level
        public int nPars;      // Meth: no. of parameters
        public Obj locals;     // Meth: to the local variables of this method
    }

    package MJ.SymTab;

    public class Struct {
        public static final int   // structure kinds
            None = 0, Int = 1, Char = 2, Arr = 3, Class = 4;
        public int kind;          // kind of this type (None, Int, Char, Class, Arr)
        public Struct elemType;   // Arr: element type
        public int n;             // Class: number of fields
        public Obj fields;        // Class: list of fields
    }

    package MJ.SymTab;

    public class Scope {
        public Scope outer;   // to the enclosing scope
        public Obj locals;    // to the objects of this scope
        public int nVars;     // number of variables in this scope
    }

    package MJ.SymTab;

    public class Tab {
        public static Scope topScope;                  // current scope
        public static Obj noObj, chrObj, ...;          // predeclared objects
        public static Struct intType, charType, ...;   // predeclared types

        public static void closeScope() {...}
        public static void openScope() {...}
        public static void init() {...}
        public static Obj insert(int kind, String name, Struct type) {...}
        public static Obj find(String name) {...}
        public static Obj findField(String name, Struct type) {...}
    }

Every class should be implemented in a separate file (Obj.java, Struct.java, etc.) in a subpackage MJ.SymTab (i.e. in a subdirectory MJ/SymTab).

The methods of class Tab have the following meaning:

insert() creates a new Obj, initializes it with the parameter values, and adds it to the current scope. If this scope already contains an object with the same name an error should be reported.
find() looks up a name in all open scopes starting at the current (i.e. innermost) scope topScope and returns the Obj node with this name. If the name was not found find() should report an error and return the predeclared value noObj.
findField() looks up a name in the field list of the specified class type and returns the Obj node with this name. If the name was not found findField() should report an error and return the predeclared value noObj.
init() initializes the symbol table, in particular it sets up the data structures for the predeclared objects and types (i.e. the universe) as shown in the lecture.

Extend your parser so that it calls the methods of class Tab to create Obj, Struct and Scope nodes for the declarations of the compiled program. In order to test if the symbol table has been built correctly, implement auxiliary methods for class Tab that dump the contents of the symbol table.

When you have checked that the symbol table is correct, extend your parser again to check the context conditions described in the MicroJava specification (Appendix A). For that you have to retrieve the information from the symbol table using the methods find() and findField().

Level 3: Code Generation

The final task is to generate code for the MicroJava Virtual Machine. Before you start, carefully study the specification of the VM (Appendix B) in order to become familiar with the run time data structures, the addressing modes, and the instructions.

All classes of the code generator should be implemented as separate files in the package MJ.CodeGen. At http://www.ssw.uni-linz.ac.at/Misc/CC/ you will find a fragment of Code.java that you can use as a starting point of your code generator.
Its interface is as follows:

    package MJ.CodeGen;

    public class Code {
        public static final int   // instruction codes
            load = 1, load_n = 2, ...;
        public static final int   // compare operators
            eq = 0, ne = 1, ...;
        private static int[] inverse = {ne, eq, ge, gt, le, lt};

        private static byte[] buf;    // code buffer
        public static int pc;         // next free byte in the code buffer
        public static int mainPc;     // pc of main function (set by the parser)
        public static int dataSize;   // length of static data in words (set by the parser)

        //--------------- code buffer access ----------------------
        public static void put(int x) {...}
        public static void put2(int x) {...}
        public static void put2(int pos, int x) {...}
        public static void put4(int x) {...}
        public static int get(int pos) {...}
        public static int get2(int pos) {...}
        private static void error(String msg) {...}

        //----------------- instruction generation --------------
        public static void init() {...}                   // initialize the code buffer
        public static Item load(Item x) {...}             // load x on the expression stack
        public static void loadConst(int n) {...}         // load constant n on the expression stack
        public static void assign(Item x, Item y) {...}   // generate code for the assignment x = y
        public static void inc(Item x, int n) {...}       // generate code to increment x by n
        public static void write(OutputStream s) {...}    // write the code buffer to the output stream

        //------------- jumps ---------------
        public static void jump(Label lab) {...}   // unconditional jump
        public static void tJump(Item x) {...}     // true jump
        public static void fJump(Item x) {...}     // false jump
    }

For maintaining Items implement a class Item as discussed in the lecture.
Its interface should look like this:

    package MJ.CodeGen;

    public class Item {
        public static final int   // item kinds
            Con = 0, Local = 1, Static = 2, Stack = 3, Fld = 4, Elem = 5, Meth = 6, Cond = 7;
        public int kind;               // Con, Local, Static, Stack, Fld, Elem, Meth, Cond
        public Struct type;            // item type
        public Obj obj;                // Meth: method object from the symbol table
        public int val;                // Con: constant value
        public int adr;                // Local, Static, Fld, Meth: address
        public int op;                 // Cond: operator
        public Label tLabel, fLabel;   // Cond: true jumps and false jumps

        public Item(Obj o) {...}
        public Item(int kind, int adr, Struct typ) {...}
    }

For maintaining labels and jumps implement a class Label as discussed in the lecture. Its interface should look like this:

    package MJ.CodeGen;

    public class Label {
        private boolean defined;   // target address already defined?
        private int adr;           // target address or start of threading list

        public Label() {...}
        public void put() {...}
        public void here() {...}
    }

Extend your parser step by step so that it calls methods to create items and labels and to emit instructions. You should implement the code generation for the various language constructs in the following order:
- selectors (i.e. obj.f, arr[i])
- expressions
- assignments
- while statements, if statements, break statements
- conditional boolean expressions
- method calls and parameter passing

In order to check your generated code you can use a Decoder class available from http://www.ssw.uni-linz.ac.at/Misc/CC/. Its interface is as follows:

    package MJ.CodeGen;

    public class Decoder {
        public static void decode(byte[] c, int off, int len) {...}
    }

On the Web page you will also find an implementation of the MicroJava Virtual Machine that can be used to execute the programs generated by your compiler. You have to download the file Run.java and compile it.
The interpreter can be started by the command

    java MJ.Run objectFile [-DEBUG]

The object file must conform to the format of the MicroJava VM specification (the method Code.write() in the fragment of the code generator produces exactly that format). The option -DEBUG causes a trace of the interpretation to be printed on the screen.

Appendix A. The MicroJava Language

This section describes the MicroJava language that is used in the practical part of the compiler construction module. MicroJava is similar to Java but much simpler.

A.1 General Characteristics

A MicroJava program consists of a single program file with static fields and static methods. There are no external classes but only inner classes that can be used as data types.

The main method of a MicroJava program is always called main(). When a MicroJava program is called this method is executed.

There are
- Constants of type int (e.g. 3) and char (e.g. 'x') but no string constants.
- Variables: all variables of the program are static.
- Primitive types: int, char (ASCII)
- Reference types: one-dimensional arrays like in Java as well as classes with fields but without methods.
- Static methods in the main class.

There is no garbage collector (allocated objects are only deallocated when the program ends).

Predeclared procedures are ord, chr, len.

Sample program

    program P
      final int size = 10;

      class Table {
        int[] pos;
        int[] neg;
      }

      Table val;

    {
      void main()
        int x, i;
      {
        /*---------- Initialize val ------------*/
        val = new Table;
        val.pos = new int[size];
        val.neg = new int[size];
        i = 0;
        while (i < size) {
          val.pos[i] = 0;
          val.neg[i] = 0;
          i++;
        }
        /*---------- Read values ---------*/
        read(x);
        while (x != 0) {
          if (0 <= x && x < size) {
            val.pos[x]++;
          } else if (-size < x && x < 0) {
            val.neg[-x]++;
          }
          read(x);
        }
      }
    }

A.2 Syntax

    Program    = "program" ident {ConstDecl | VarDecl | ClassDecl} "{" {MethodDecl} "}".
    ConstDecl  = "final" Type ident "=" (number | charConst) ";".
    VarDecl    = Type ident {"," ident } ";".
    ClassDecl  = "class" ident "{" {VarDecl} "}".
    MethodDecl = (Type | "void") ident "(" [FormPars] ")" {VarDecl} Block.
    FormPars   = Type ident {"," Type ident}.
    Type       = ident ["[" "]"].

    Block      = "{" {Statement} "}".
    Statement  = Designator ("=" Expr | "(" [ActPars] ")" | "++" | "--") ";"
               | "if" "(" Condition ")" Statement ["else" Statement]
               | "while" "(" Condition ")" Statement
               | "break" ";"
               | "return" [Expr] ";"
               | "read" "(" Designator ")" ";"
               | "print" "(" Expr ["," number] ")" ";"
               | Block
               | ";".
    ActPars    = Expr {"," Expr}.

    Condition  = CondTerm {"||" CondTerm}.
    CondTerm   = CondFact {"&&" CondFact}.
    CondFact   = Expr Relop Expr.
    Relop      = "==" | "!=" | ">" | ">=" | "<" | "<=".

    Expr       = ["-"] Term {Addop Term}.
    Term       = Factor {Mulop Factor}.
    Factor     = Designator ["(" [ActPars] ")"]
               | number
               | charConst
               | "new" ident ["[" Expr "]"]
               | "(" Expr ")".
    Designator = ident {"." ident | "[" Expr "]"}.
    Addop      = "+" | "-".
    Mulop      = "*" | "/" | "%".

Lexical structure

    Terminal classes:
        ident     = letter {letter | digit | "_"}.
        number    = digit {digit}.
        charConst = "'" char "'".     // including '\r' and '\n'

    Keywords:
        program  class  if  else  void  final  while  new  read  print  return  break

    Operators:
        +   -   *   /   %   ==  !=  >   >=  <   <=  &&  ||
        =   ++  --  ;   ,   .   (   )   [   ]   {   }

    Comments:
        // to the end of line

A.3 Semantics

All terms in this document that have a definition are underlined to emphasize their special meaning. The definitions of these terms are given here.

Reference type
    Arrays and classes are called reference types.

Type of a constant
    The type of an integer constant (e.g. 17) is int.
    The type of a character constant (e.g. 'x') is char.

Same type
    Two types are the same if they are denoted by the same type name, or if both types are arrays and their element types are the same.

Type compatibility
    Two types are compatible if they are the same, or if one of them is a reference type and the other is the type of null.
Assignment compatibility
    A type src is assignment compatible with a type dst if src and dst are the same, or if dst is a reference type and src is the type of null.

Predeclared names
    int     the type of all integer values
    char    the type of all character values
    null    the null value of a class or array variable, meaning "pointing to no value"
    chr     standard method; chr(i) converts the int expression i into a char value
    ord     standard method; ord(ch) converts the char value ch into an int value
    len     standard method; len(a) returns the number of elements of the array a

Scope
    A scope is the textual range of a method or a class. It extends from the point after the declaring method or class name to the closing curly bracket of the method or class declaration. A scope excludes other scopes that are nested within it. We assume that there is an (artificial) outermost scope, to which the main class is local and which contains all predeclared names. The declaration of a name in an inner scope S hides the declarations of the same name in outer scopes.

Note
    Indirect recursion is not allowed, since every name must be declared before it is used. This would not be possible if indirect recursion were allowed.
    A predeclared name (e.g. int or char) can be redeclared in an inner scope (but this is not recommended).

A.4 Context Conditions

General context conditions
- Every name must be declared before it is used.
- A name must not be declared twice in the same scope.
- A program must contain a method named main. It must be declared with a void function type and must not have parameters.

Context conditions for standard methods
    chr(e)   e must be an expression of type int.
    ord(c)   c must be of type char.
    len(a)   a must be an array.

Context conditions for the MicroJava productions

Program = "program" ident {ConstDecl | VarDecl | ClassDecl} "{" {MethodDecl} "}".

ConstDecl = "final" Type ident "=" (number | charConst) ";".
- The type of number or charConst must be the same as the type of Type.

VarDecl = Type ident ["[" "]"] {"," ident ["[" "]"]} ";".

ClassDecl = "class" ident "{" {VarDecl} "}".

MethodDecl = (Type | "void") ident "(" [FormPars] ")" {VarDecl} "{" {Statement} "}".
- If a method is a function it must be left via a return statement (this is checked at run time).

FormPars = Type ident ["[" "]"] {"," Type ident ["[" "]"]}.

Type = ident.
- ident must denote a type.

Statement = Designator "=" Expr ";".
- Designator must denote a variable, an array element or an object field.
- The type of Expr must be assignment compatible with the type of Designator.

Statement = Designator ("++" | "--") ";".
- Designator must denote a variable, an array element or an object field.
- Designator must be of type int.

Statement = Designator "(" [ActPars] ")" ";".
- Designator must denote a method.

Statement = "break" ";".
- The break statement must be contained in a while statement.

Statement = "read" "(" Designator ")" ";".
- Designator must denote a variable, an array element or an object field.
- Designator must be of type int or char.

Statement = "print" "(" Expr ["," number] ")" ";".
- Expr must be of type int or char.

Statement = "return" [Expr] ";".
- The type of Expr must be assignment compatible with the function type of the current method.
- If Expr is missing the current method must be declared as void.

Statement = "if" "(" Condition ")" Statement ["else" Statement]
          | "while" "(" Condition ")" Statement
          | "{" {Statement} "}"
          | ";".

ActPars = Expr {"," Expr}.
- The numbers of actual and formal parameters must match.
- The type of every actual parameter must be assignment compatible with the type of the formal parameter at the corresponding position.

Condition = CondTerm {"||" CondTerm}.

CondTerm = CondFact {"&&" CondFact}.

CondFact = Expr Relop Expr.
- The types of both expressions must be compatible.
- Classes and arrays can only be checked for equality or inequality.

Expr = Term.

Expr = "-" Term.
- Term must be of type int.

Expr = Expr Addop Term.
- Expr and Term must be of type int.

Term = Factor.

Term = Term Mulop Factor.
- Term and Factor must be of type int.

Factor = Designator | number | charConst | "(" Expr ")".

Factor = Designator "(" [ActPars] ")".
- Designator must denote a method.

Factor = "new" Type.
- Type must denote a class.

Factor = "new" Type "[" Expr "]".
- The type of Expr must be int.

Designator = Designator "." ident.
- The type of Designator must be a class.
- ident must be a field of Designator.

Designator = Designator "[" Expr "]".
- The type of Designator must be an array.
- The type of Expr must be int.

Relop = "==" | "!=" | ">" | ">=" | "<" | "<=".

Addop = "+" | "-".

Mulop = "*" | "/" | "%".

A.5 Implementation Restrictions

- There must not be more than 256 local variables.
- There must not be more than 65536 global variables.
- A class must not have more than 65536 fields.
- The code of the program must not be longer than 8 KBytes.

Appendix B. The MicroJava VM

This section describes the architecture of the MicroJava Virtual Machine that is used in the practical part of this compiler construction module. The MicroJava VM is similar to the Java VM but has fewer instructions. Some instructions were also simplified. Whereas the Java VM uses operand names from the constant pool that are resolved by the loader, the MicroJava VM uses fixed operand addresses. Java instructions encode the types of their operands so that a verifier can check the consistency of an object file. MicroJava instructions do not encode operand types.

B.1 Memory Layout

The memory areas of the MicroJava VM are as follows.

[Figure: memory layout of the MicroJava VM, showing the areas code, data, heap, pstack and estack together with the labels pc, free, fp, sp, esp, ra and dl.]

    code     byte array
    data     word array
    heap     word array
    estack   word array
    pstack   word array

code
    This area contains the code of the methods. The register pc contains the index of the currently executed instruction. mainpc contains the start address of the method main().

data
    This area holds the (static or global) data of the main program. It is an array of variables. Every variable occupies one word (32 bits). The addresses of the variables are indexes into the array.

heap
    This area holds the dynamically allocated objects and arrays. The blocks are allocated consecutively. free points to the beginning of the still unused area of the heap. Dynamically allocated memory is only returned at the end of the program. There is no garbage collector. All object fields occupy a single word (32 bits). Arrays of char elements are byte arrays. Their length is a multiple of 4. Pointers are byte offsets into the heap. Array objects start with an invisible word, containing the array length.

pstack
    This area (the procedure stack) maintains the activation frames of the invoked methods. Every frame consists of an array of local variables, each occupying a single word (32 bits). Their addresses are indexes into the array. ra is the return address of the method, dl is the dynamic link (a pointer to the frame of the caller). A newly allocated frame is initialized with all zeroes.

estack
    This area (the expression stack) is used to store the operands of the instructions. After every MicroJava statement estack is empty. Method parameters are passed on the expression stack and are removed by the Enter instruction of the invoked method. The expression stack is also used to pass the return value of the method back to the caller.

All data (global variables, local variables, heap variables) are initialized with a null value (0 for int, chr(0) for char, null for references).

B.2 Instruction Set

The following tables show the instructions of the MicroJava VM together with their encoding and their behaviour. The third column of the tables shows the contents of estack before and after every instruction, for example

    ..., val, val
    ..., val

means that this instruction removes two words from estack and pushes a new word onto it.
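The before/after notation can be tried out on a tiny model of the expression stack. The sketch below is only an illustration: the class name EStackDemo, the helper depth() and the fixed stack size are assumptions, while the bodies of add and sub follow the behaviour given for these two instructions in the arithmetic table of this section.

```java
// Hypothetical model of the expression stack, for illustration only.
class EStackDemo {
    private static final int[] estack = new int[32]; // size is an arbitrary choice
    private static int esp = 0;                      // index of the next free slot

    static void push(int x) { estack[esp++] = x; }
    static int pop() { return estack[--esp]; }
    static int depth() { return esp; }

    // ..., val1, val2  becomes  ..., val1+val2
    static void add() { push(pop() + pop()); }

    // ..., val1, val2  becomes  ..., val1-val2
    // val2 is on top, so it is popped (and negated) first.
    static void sub() { push(-pop() + pop()); }
}
```

Pushing 7 and then 3 and executing sub() leaves the single value 4 on the stack: two words were removed and one was pushed, exactly as the notation says. The operand order matters, which is why the behaviour of sub negates the first value it pops.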
The operands of the instructions have the following meaning:

    b   a byte
    s   a short int (16 bits)
    w   a word (32 bits)

Variables of type char are stored in the lowest byte of a word and are manipulated with word instructions (e.g. load, store). Array elements of type char are stored in a byte array and are loaded and stored with special instructions.

In the tables below, each entry gives the opcode, the instruction with its operands, the contents of estack before -> after (where applicable) and the name of the instruction, followed on the next lines by its behaviour.

Loading and storing of local variables

    1       load b          ... -> ..., val                 Load
                push(local[b]);
    2..5    load_n          ... -> ..., val                 Load (n = 0..3)
                push(local[n]);
    6       store b         ..., val -> ...                 Store
                local[b] = pop();
    7..10   store_n         ..., val -> ...                 Store (n = 0..3)
                local[n] = pop();

Loading and storing of global variables

    11      getstatic s     ... -> ..., val                 Load static variable
                push(data[s]);
    12      putstatic s     ..., val -> ...                 Store static variable
                data[s] = pop();

Loading and storing of object fields

    13      getfield s      ..., adr -> ..., val            Load object field
                adr = pop()/4; push(heap[adr+s]);
    14      putfield s      ..., adr, val -> ...            Store object field
                val = pop(); adr = pop()/4; heap[adr+s] = val;

Loading of constants

    15..20  const_n         ... -> ..., val                 Load constant (n = 0..5)
                push(n);
    21      const_m1        ... -> ..., -1                  Load minus one
                push(-1);
    22      const w         ... -> ..., val                 Load constant
                push(w);

Arithmetic

    23      add             ..., val1, val2 -> ..., val1+val2    Add
                push(pop() + pop());
    24      sub             ..., val1, val2 -> ..., val1-val2    Subtract
                push(-pop() + pop());
    25      mul             ..., val1, val2 -> ..., val1*val2    Multiply
                push(pop() * pop());
    26      div             ..., val1, val2 -> ..., val1/val2    Divide
                x = pop(); push(pop() / x);
    27      rem             ..., val1, val2 -> ..., val1%val2    Remainder
                x = pop(); push(pop() % x);
    28      neg             ..., val -> ..., -val           Negate
                push(-pop());
    29      shl             ..., val, x -> ..., val1        Shift left
                x = pop(); push(pop() << x);
    30      shr             ..., val, x -> ..., val1        Shift right (arithmetically)
                x = pop(); push(pop() >> x);
    31      inc b1, b2      ... -> ...                      Increment variable
                local[b1] = local[b1] + b2;

Object creation

    32      new s           ... -> ..., adr                 New object
                allocate area of s bytes; initialize area to all 0; push(adr(area));
    33      newarray b      ..., n -> ..., adr              New array
                n = pop();
                if (b==0) alloc. array with n elems of byte size;
                else if (b==1) alloc. array with n elems of word size;
                initialize array to all 0; push(adr(array))

Array access

    34      aload           ..., adr, i -> ..., val         Load array element
                i = pop(); adr = pop()/4+1; push(heap[adr+i]);
    35      astore          ..., adr, i, val -> ...         Store array element
                val = pop(); i = pop(); adr = pop()/4+1; heap[adr+i] = val;
    36      baload          ..., adr, i -> ..., val         Load byte array element
                i = pop(); adr = pop()/4+1; x = heap[adr+i/4]; push(byte i%4 of x);
    37      bastore         ..., adr, i, val -> ...         Store byte array element
                val = pop(); i = pop(); adr = pop()/4+1; x = heap[adr+i/4];
                set byte i%4 in x; heap[adr+i/4] = x;
    38      arraylength     ..., adr -> ..., len            Get array length
                adr = pop(); push(heap[adr]);

Stack manipulation

    39      pop             ..., val -> ...                 Remove topmost stack element
                dummy = pop();
    40      dup             ..., val -> ..., val, val       Duplicate topmost stack element
                x = pop(); push(x); push(x);
    41      dup2            ..., v1, v2 -> ..., v1, v2, v1, v2    Duplicate top two stack elements
                y = pop(); x = pop(); push(x); push(y); push(x); push(y);

Jumps (jump distance relative to the beginning of the jump instruction)

    42      jmp s                                           Jump unconditionally
                pc = pc + s;
    43..48  j<cond> s       ..., x, y -> ...                Jump conditionally (eq, ne, lt, le, gt, ge)
                y = pop(); x = pop(); if (x cond y) pc = pc + s;

Method call (PUSH and POP work on pstack)

    49      call s                                          Call method
                PUSH(pc+3); pc = pc + s;
    50      return                                          Return
                pc = POP();
    51      enter b1, b2                                    Enter method
                psize = b1; lsize = b2;   // in words
                PUSH(fp); fp = sp; sp = sp + lsize;
                initialize frame to 0;
                for (i=psize-1;i>=0;i--) local[i] = pop();
    52      exit                                            Exit method
                sp = fp; fp = POP();

Input/Output

    53      read            ... -> ..., val                 Read
                readInt(x); push(x);
    54      print           ..., val, width -> ...          Print
                width = pop(); writeInt(pop(), width);
    55      bread           ... -> ..., val                 Read byte
                readChar(ch); push(ch);
    56      bprint          ..., val, width -> ...          Print byte
                width = pop(); writeChar(pop(), width);

Miscellaneous

    57      trap b                                          Generate run time error
                print error message depending on b; stop execution;

B.3 Object File Format

    2 bytes:  "MJ"
    4 bytes:  code size in bytes
    4 bytes:  number of words for the global data
    4 bytes:  mainPC: the address of main() relative to the beginning of the code area
    n bytes:  the code area (n = code size specified in the header)

B.4 Run Time Errors

    1   Missing return statement in function.
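
The object file layout of section B.3 can be produced with a few lines of Java. The following is a sketch under assumptions, not the Code.write() method from the course fragment: the class name ObjectFileSketch and the method encode() are hypothetical, and the 4-byte fields are written in big-endian order (the order used by java.io.DataOutputStream), because B.3 does not state the byte order. Check the order against Run.java before relying on it.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper that assembles an object file image in memory.
class ObjectFileSketch {
    // code:     the code buffer (only the first codeSize bytes are used)
    // dataSize: number of words for the global data
    // mainPc:   address of main() relative to the beginning of the code area
    static byte[] encode(byte[] code, int codeSize, int dataSize, int mainPc) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte('M');           // 2 bytes: "MJ"
            out.writeByte('J');
            out.writeInt(codeSize);       // 4 bytes: code size in bytes
            out.writeInt(dataSize);       // 4 bytes: words of global data
            out.writeInt(mainPc);         // 4 bytes: mainPC
            out.write(code, 0, codeSize); // n bytes: the code area
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

For a code area that holds only the two instructions exit (52) and return (50), encode(new byte[] {52, 50}, 2, 0, 0) yields 16 bytes: the 14-byte header followed by the two opcodes.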