CS 320: Compiling Techniques David Walker

advertisement
CS 320: Compiling Techniques
David Walker
People

David Walker (Professor)




412 Computer Science Building
dpw@cs.princeton.edu
office hours: after each class
Dan Dantas (TA)



417 Computer Science Building
ddantas@cs.princeton.edu
office hours: Mondays 2-3 PM
Information

Web site:


www.cs.princeton.edu/courses/archive/spri
ng04/cos320/index.htm
Mailing list:
Books

Modern Compiler Implementation in ML



Andrew Appel
required
Elements of ML Programming


Jeffrey D. Ullman
also: online references; see Web site
Assignment 0




Write your name and other information on
the sheet circulating
Find, skim and bookmark the course web
pages
Subscribe to course e-mail list
Begin assignment 1



Read chapter 1 Appel
Figure out how to run & use SML
Due next Thursday 12
onward!
What is a compiler?

A compiler is program that translates a
source language into an equivalent
target language
What is a compiler?
while (i > 3) {
a[i] = b[i];
i ++
}
C program
compiler
does this
mov eax, ebx
add eax, 1
cmp eax, 3
jcc eax, edx
assembly
program
What is a compiler?
class foo {
int bar;
...
}
Java program
compiler
does this
struct foo {
int bar;
...
}
C program
What is a compiler?
class foo {
int bar;
...
}
Java program
compiler
does this
........
.........
........
Java virtual
machine program
What is a compiler?
\newcommand{
....
}
Latex program
compiler
does this
\sfd\sf\fadg
Tex program
What is a compiler?
\newcommand{
....
}
Tex program
compiler
does this
\sfd\sf\fadg
Postscript program
What is a compiler?

Other places:




Web scripts are compiled into HTML
assembly language is compiled into
machine language
hardware description language is compiled
into a hardware circuit
...
Compilers are complex
front-end

text file to abstract syntax


middle-end
abstract syntax to
intermediate form (IR)

back-end

lexing; parsing
analysis; optimizations;
data layout
IR to machine code

code generation; register
allocation
Course project
front-end

Tiger Source Language


middle-end
Instruction Trees as
intermediate form (IR)

back-end

simple imperative language
type checking; data layout on
the stack
Code Generation

instruction selection
algorithms; register allocation
via graph coloring
Standard ML


Standard ML is a domain-specific
language for building compilers
Support for




Complex data structures (abstract syntax,
compiler intermediate forms)
Memory management like Java
Large projects with many modules
Advanced type system for error detection
Introduction to ML


You will be responsible for learning ML
on your own.
Today I will cover some basics

Resources:



Jeffrey Ullman “Elements of ML Programming”
Robert Harper’s “an introduction to ML”
See course webpage for pointers and info
about how to get the software
Intro to ML

Highlights

Data Structures for compilers



Strongly-typed language




Data type definitions
Pattern matching
Every expression has a type
Certain errors cannot occur
Polymorphic types provide flexibility
Flexible Module System


Abstract Types
Higher-order modules (functors)
Intro to ML

Interactive Language




Type in expressions
Evaluate and print type and result
Compiler as well
High-level programming features




Data types
Pattern matching
Exceptions
Mutable data discouraged
Preliminaries

Read – Eval – Print – Loop
- 3 + 2;
Preliminaries

Read – Eval – Print – Loop
- 3 + 2;
> 5: int
Preliminaries

Read – Eval – Print – Loop
- 3 + 2;
> 5: int
- it + 7;
> 12 : int
Preliminaries

Read – Eval – Print – Loop
- 3 + 2;
> 5: int
- it + 7;
> 12 : int
- it – 3;
> 9 : int
- 4 + true;
stdIn:17.1-17.9 Error: operator and operand don't
agree [literal]
operator domain: int * int
operand:
int * bool
in expression:
4 + true
Preliminaries

Read – Eval – Print – Loop
- 3 div 0;
Failure : Div
- run-time error
Basic Values
- ();
> () : unit
=> like “void” in C (sort of)
=> the uninteresting value/type
- true;
> true : bool
- false;
> false : bool
- if it then 3+2 else 7;
> 7 : int
- false andalso loop_Forever;
> false : bool
“else” clause is always necessary
and also, or else short-circuit eval
Basic Values
Integers
-3+2
> 5 : int
- 3 + (if not true then 5 else 7);
> 10 : int
Strings
- “Dave” ^ “ “ ^ “Walker”;
> “Dave Walker” : string
- print “foo\n”;
foo
> 3 : int
Reals
- 3.14;
> 3.14 : real
No division between expressions
and statements
Using SML/NJ
Interactive mode is a good way to start
learning and to debug programs, but…
 Type in a series of declarations into a
“.sml” file
- use “foo.sml”
[opening foo.sml]
list of declarations
…
with their types

Larger Projects


SML has its own built in interactive
“make”
Pros:



It automatically does the dependency
analysis for you
No crazy makefile syntax to learn
Cons:

May be more difficult to interact with other
languages or tools
Compilation Manager
sources.cm
Group is
a.sig
b.sml
c.sml
a.sig
b.sml
c.sml
% sml
- OS.FileSys.chDir “~/courses/510/a2”;
- CM.make();
looks for “sources.cm”, analyzes dependencies
[compiling…]
compiles files in group
[wrote…]
saves binaries in ./CM/
- CM.make’ “myproj/”();
specify directory
What is next?

ML has a rich set of structured values







Tuples: (17, true, “stuff”)
Records: {name = “Dave”, ssn = 332177}
Lists: 3::4::5::nil or [3,4]@[5]
Datatypes
Functions
And more!
Rather than list all the details, we will
write a couple of programs
An interpreter

Interpreters are usually implemented as
a series of transformers:
lexing/
parsing
stream of
characters
evaluate
abstract
syntax
print
abstract
value
stream of
characters
A little language (LL)

An arithmetic expression e is





a boolean value
an if statement (if e1 then e2 else e3)
an integer
an add operation
a test for zero (isZero e)
LL abstract syntax in ML
datatype term =
Bool of bool
| If of term * term * term
| Num of int
| Add of term * term
| IsZero of term
vertical bar
separates alternatives
-- constructors
are capitalized
-- constructors
can take a single
argument of a
particular type
type of a tuple
another eg: string * char
LL abstract syntax in ML
Add
Add (Num 2, Num 3)
Num
represents the expression “2 + 3”
2
Num
3
LL abstract syntax in ML
If
If (Bool true,
Num 0,
Add (Num 2, Num 3))
represents
Bool Num
true
Add
Num Num
0
“if true then 0 else 2 + 3”
2
3
Function declarations
function name
fun isValue t =
case t of
Num n => true
| Bool b => true
| _ => false
default pattern matches anything
function parameter
What is the type of the
parameter t? Of the function?
function name
fun isValue t =
case t of
Num n => true
| Bool b => true
| _ => false
default pattern matches anything
function parameter
What is the type of the
parameter t? Of the function?
fun isValue (t:term) : bool =
case t of
Num n => true
| Bool b => true
| _ => false
val isValue : term -> bool
ML does type inference => you need not
annotate functions yourself (but it can be helpful)
A type error
fun isValue t =
case t of
Num _ => true
| _ => false
ex.sml:22.3-24.15 Error: types of rules don't agree [literal]
earlier rule(s): term -> int
this rule: term -> bool
in rule:
_ => false
A type error
Actually, ML will give you several errors in a row:
ex.sml:22.3-25.15 Error: types of rules don't agree [literal]
earlier rule(s): term -> int
this rule: term -> bool
in rule:
Successor t2 => true
ex.sml:22.3-25.15 Error: types of rules don't agree [literal]
earlier rule(s): term -> int
this rule: term -> bool
in rule:
_ => false
A very subtle error
fun isValue t =
case t of
num => true
| _ => false
The code above type checks. But when
we test it refined the function always returns “true.”
What has gone wrong?
A very subtle error
fun isValue t =
case t of
Num 0 => 1
| Add(Num t1,Num t2) => t1 + t2
| _ => 0
The code above type checks. But when
we test it refined the function always returns “true.”
What has gone wrong?
-- num is not capitalized (and has no argument)
-- ML treats it like a variable pattern (matches anything!)
Exceptions
exception Error of string
fun debug s : unit = raise (Error s)
Exceptions
exception Error of string
fun debug s : unit = raise (Error s)
in SML interpreter:
- debug "hello";
uncaught exception Error
raised at: ex.sml:15.28-15.35
Evaluator
fun isValue t = ...
exception NoRule
fun eval t =
case t of
Bool _ | Num _ => t
| ...
Evaluator
...
let statement
fun eval t =
for remembering
case t of
temporary
Bool _ | Num _ => t
results
| If(t1,t2,t3) =>
let val v = eval t1 in
case v of
Bool b => if b then (eval t2) else (eval t3)
| _ => raise NoRule
end
Evaluator
exception NoRule
fun eval1 t =
case t of
Bool _ | Num _ => ...
| ...
| Add (t1,t2) =>
case (eval v1, eval v2) of
(Num n1, Num n2) => Num (n1 + n2)
| (_,_) => raise NoRule
Finishing the Evaluator
fun eval1 t =
case t of
...
| ...
| Add (t1,t2) => ...
| IsZero t => ...
be sure your
case is
exhaustive
Finishing the Evaluator
fun eval1 t =
case t of
...
| ...
| Add (t1,t2) => ...
What if we
forgot a case?
Finishing the Evaluator
fun eval1 t =
case t of
...
| ...
| Add (t1,t2) => ...
What if we
forgot a case?
ex.sml:25.2-35.12 Warning: match nonexhaustive
(Bool _ | Zero) => ...
If (t1,t2,t3) => ...
Add (t1,t2) => ...
Download