Language Implementation
Overview
John Keyser
Spring 2016
CSCE 314 – Programming Studio
Implementing a Programming Language – How
to Undo the Abstraction
Source
program
Optimizer
I/O
Lexer
Parser
Type
checker
Code
generator
Machine code
JIT
Bytecode
Interpreter
Virtual machine
I/O
I/O
Spring 2016
Machine
CSCE 314 – Programming Studio
Implementing a Programming Language – How
to Undo the Abstraction
Source
program
Optimizer
I/O
Lexer
Parser
Type
checker
Code
generator
Machine code
JIT
Bytecode
Interpreter
Virtual machine
I/O
I/O
Spring 2016
Machine
CSCE 314 – Programming Studio
Lexical Analysis
From a stream of characters
if (a == b) return;
to a stream of tokens
keyword[‘if‘]
symbol[‘(‘]
identifier[‘a‘]
symbol[‘==‘]
identifier[‘b‘]
symbol[‘)‘]
keyword[‘return‘]
symbol[‘;‘]
Spring 2016
CSCE 314 – Programming Studio
Implementing a Programming Language – How
to Undo the Abstraction
Source
program
Optimizer
I/O
Lexer
Parser
Type
checker
Code
generator
Machine code
JIT
Bytecode
Interpreter
Virtual machine
I/O
I/O
Spring 2016
Machine
CSCE 314 – Programming Studio
Syntactic Analysis (Parsing)
From a stream of
characters
to a syntax tree (parse tree)
if-statement
if (a == b) return;
to a stream of tokens
keyword[‘if‘]
symbol[‘(‘]
expression
statement
equality operator
return stmt
identifier[‘a‘]
symbol[‘==‘]
identifier[‘b‘]
identifier
identifier
a
b
symbol[‘)‘]
keyword[‘return‘]
symbol[‘;‘]
Spring 2016
CSCE 314 – Programming Studio
Implementing a Programming Language – How
to Undo the Abstraction
Source
program
Optimizer
I/O
Lexer
Parser
Type
checker
Code
generator
Machine code
JIT
Bytecode
Interpreter
Virtual machine
I/O
I/O
Spring 2016
Machine
CSCE 314 – Programming Studio
Type Checking
if (a == b) return;
if-statement : OK
statement : OK
Annotate syntax tree expression : bool
with types, check that
types are used
equality operator :
correctly
integer equality
identifier : int
a
Spring 2016
return stmt : void
identifier : int
b
CSCE 314 – Programming Studio
Implementing a Programming Language – How
to Undo the Abstraction
Source
program
Optimizer
I/O
Lexer
Parser
Type
checker
Code
generator
Machine code
JIT
Bytecode
Interpreter
Virtual machine
I/O
I/O
Spring 2016
Machine
CSCE 314 – Programming Studio
Optimization
int a = 10;
int b = 20 – a;
if (a == b) return;
if-statement : OK
if-statement : OK
statement : OK
expression : bool
equality operator :
integer equality
identifier : int
a
Constant propagation can deduce
that always a==b, allowing the
optimizer to transform the tree:
return stmt : void
statement : OK
constant : bool
return stmt : void
identifier : int
b
true
Spring 2016
return stmt : void
CSCE 314 – Programming Studio
Implementing a Programming Language – How
to Undo the Abstraction
Source
program
Optimizer
I/O
Lexer
Parser
Type
checker
Code
generator
Machine code
JIT
Bytecode
Interpreter
Virtual machine
I/O
I/O
Spring 2016
Machine
CSCE 314 – Programming Studio
Code Generation
Code generation is essentially undoing abstractions,
until code is executable by some target machine:
Variables become memory locations
Variable names become addresses to memory
locations
Expressions become loads of memory locations to
registers, register operations, and stores back to
memory
Abstract data types etc. disappear. What is left is data
types directly supported by the machine such as
integers, bytes, floating point numbers, etc.
Control structures become jumps and conditional
jumps to labels (essentially goto statements)
Spring 2016
CSCE 314 – Programming Studio
Phases of Compilation/Execution
Characterized by Errors Detected
• Lexical analysis:
5abc
a === b
• Syntactic analysis:
if + then;
int f(int a];
• Type checking:
void f(); int a; a + f();
• Execution time:
int a[100]; a[101] = 5;
Spring 2016