Language Implementation Overview
John Keyser
Spring 2016, CSCE 314 – Programming Studio

Implementing a Programming Language – How to Undo the Abstraction

[Figure: compilation and execution pipeline. Boxes: Source program, Lexer, Parser, Type checker, Optimizer, Code generator, Machine code, Bytecode, JIT, Interpreter, Virtual machine, Machine, I/O]

Lexical Analysis

From a stream of characters

    if (a == b) return;

to a stream of tokens

    keyword['if']
    symbol['(']
    identifier['a']
    symbol['==']
    identifier['b']
    symbol[')']
    keyword['return']
    symbol[';']

Syntactic Analysis (Parsing)

From a stream of characters, to a stream of tokens, to a syntax tree (parse tree)

    if-statement
        expression
            equality operator
                identifier a
                identifier b
        statement
            return stmt

Type Checking

    if (a == b) return;

Annotate the syntax tree with types, and check that types are used correctly.

    if-statement : OK
        expression : bool
            equality operator : integer equality
                identifier : int a
                identifier : int b
        statement : OK
            return stmt : void
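The type-checking step can be sketched as a recursive walk over the syntax tree of `if (a == b) return;` that computes an annotation for each node. This is a minimal illustration, not the course's implementation: the node classes (`Identifier`, `Equality`, `Return`, `If`), the `check` function, and the use of plain strings as type names are all assumptions made for the example.

```python
from dataclasses import dataclass


# Hypothetical node classes for illustration; the slides do not define any.
@dataclass
class Identifier:
    name: str


@dataclass
class Equality:
    left: object
    right: object


@dataclass
class Return:
    pass


@dataclass
class If:
    condition: object
    body: object


def check(node, env):
    """Return the annotation for node, or raise TypeError on misuse.

    env maps declared variable names to type names such as "int".
    """
    if isinstance(node, Identifier):
        if node.name not in env:
            raise TypeError(f"undeclared identifier {node.name}")
        return env[node.name]
    if isinstance(node, Equality):
        left = check(node.left, env)
        right = check(node.right, env)
        if left != right:
            raise TypeError(f"cannot compare {left} with {right}")
        return "bool"
    if isinstance(node, Return):
        return "void"
    if isinstance(node, If):
        cond = check(node.condition, env)
        if cond != "bool":
            raise TypeError(f"if condition must be bool, got {cond}")
        check(node.body, env)
        return "OK"
    raise TypeError(f"unknown node {node!r}")


# The tree for: if (a == b) return;   with a and b declared as int.
tree = If(Equality(Identifier("a"), Identifier("b")), Return())
print(check(tree, {"a": "int", "b": "int"}))  # prints OK
```

Declaring `b` with a different type than `a` makes `check` raise `TypeError`, which corresponds to a program being rejected in the type-checking phase rather than at run time.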
Optimization

    int a = 10;
    int b = 20 - a;
    if (a == b) return;

Constant propagation can deduce that a == b is always true, allowing the optimizer to transform the tree.

Original tree:

    if-statement : OK
        expression : bool
            equality operator : integer equality
                identifier : int a
                identifier : int b
        statement : OK
            return stmt : void

After constant propagation:

    if-statement : OK
        constant : bool true
        statement : OK
            return stmt : void

After simplification:

    return stmt : void

Code Generation

Code generation is essentially undoing abstractions until the code is executable by some target machine:

• Variables become memory locations
• Variable names become addresses of memory locations
• Expressions become loads from memory locations into registers, register operations, and stores back to memory
• Abstract data types etc. disappear. What is left are the data types directly supported by the machine, such as integers, bytes, and floating point numbers
• Control structures become jumps and conditional jumps to labels (essentially goto statements)

Phases of Compilation/Execution Characterized by Errors Detected

• Lexical analysis: 5abc    a === b
• Syntactic analysis: if + then;    int f(int a];
• Type checking: void f(); int a; a + f();
• Execution time: int a[100]; a[101] = 5;
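The constant propagation example from the Optimization slide (`int a = 10; int b = 20 - a; if (a == b) return;`) can be sketched as follows. This is only an illustration under stated assumptions: the tuple-based tree encoding and the `fold` and `optimize` helpers are hypothetical, and the sketch assumes variables are never reassigned after their declaration, which a real optimizer would have to verify.

```python
# Hypothetical tuple-based tree encoding (not taken from the slides):
#   ("const", value)   ("var", name)
#   ("sub", left, right)   ("eq", left, right)
#   ("if", condition, body)   ("return",)


def fold(expr, env):
    """Substitute known constants for variables and evaluate constant operations."""
    kind = expr[0]
    if kind == "const":
        return expr
    if kind == "var":
        name = expr[1]
        return ("const", env[name]) if name in env else expr
    if kind in ("sub", "eq"):
        left = fold(expr[1], env)
        right = fold(expr[2], env)
        if left[0] == "const" and right[0] == "const":
            if kind == "sub":
                return ("const", left[1] - right[1])
            return ("const", left[1] == right[1])
        return (kind, left, right)
    raise ValueError(f"unknown expression {expr!r}")


def optimize(declarations, stmt):
    """Propagate constants from declarations into stmt, then simplify it.

    Assumes no variable is reassigned between its declaration and stmt.
    """
    env = {}
    for name, init in declarations:
        value = fold(init, env)
        if value[0] == "const":
            env[name] = value[1]
    if stmt[0] == "if":
        cond = fold(stmt[1], env)
        if cond == ("const", True):
            return stmt[2]   # condition always true: keep only the body
        if cond == ("const", False):
            return None      # condition always false: the statement is dead code
        return ("if", cond, stmt[2])
    return stmt


# int a = 10; int b = 20 - a; if (a == b) return;
declarations = [
    ("a", ("const", 10)),
    ("b", ("sub", ("const", 20), ("var", "a"))),
]
statement = ("if", ("eq", ("var", "a"), ("var", "b")), ("return",))
print(optimize(declarations, statement))  # prints ('return',)
```

The result matches the final tree on the slide: the whole if-statement collapses to the bare return statement. If a value is unknown at compile time, `fold` leaves the expression in place and the if-statement survives.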