Course & ML Intro, part 1

CS 320: Compiling Techniques
David Walker
People

David Walker (Professor)




412 Computer Science Building
dpw@cs.princeton.edu
office hours: after each class
Limin Jia, Jay Ligatti (TAs)



418a Computer Science Building
ljia,jligatti@cs.princeton.edu
office hours:

Mondays & Wednesdays (we’ll send email to the email
list)
Information

Web site:


www.cs.princeton.edu/courses/archive/spring05/cos320/index.htm
Mailing list:

To subscribe:


cos320-request@lists.cs.princeton.edu
To post to this list, send your email to:

cos320@lists.cs.princeton.edu
Books


Modern Compiler Implementation in ML,
Andrew Appel
A reference manual for SML

best choice: Online references


see course web site
several hardcopy books

Elements of ML Programming, Jeffrey D.
Ullman
Work

Assignments: 40%
  build your own compiler
  approximately a module/week
  late penalty: 20%/day. Don’t be late!
  ask questions of me, TAs, friends on course mailing list
  turn in your own work
In-class Midterm: 25%
Final during exam period: 35%
Assignment 0




Write your name and other information on
the sheet circulating
Find, skim and bookmark the course web
pages
Subscribe to course e-mail list
Begin assignment 1



Figure out how to install, run & use SML
Due next Thursday February 16
If you’ve never used a functional language like
ML, this might be a difficult assignment. Start
early!
onward!
What is a compiler?

A compiler is a program that translates a
program in a source language into an
equivalent program in a target language
What is a compiler?
C program:
    while (i > 3) {
      a[i] = b[i];
      i++;
    }
the compiler translates this into an assembly program:
    mov eax, ebx
    add eax, 1
    cmp eax, 3
    jcc eax, edx

Java program:
    class foo {
      int bar;
      ...
    }
the compiler translates this into a C program:
    struct foo {
      int bar;
      ...
    }

Java program:
    class foo {
      int bar;
      ...
    }
the compiler translates this into a Java virtual machine program:
    ........

LaTeX program:
    \newcommand{
    ....
    }
the compiler translates this into a TeX program:
    \sfd\sf\fadg

TeX program:
    \newcommand{
    ....
    }
the compiler translates this into a PostScript program:
    \sfd\sf\fadg
What is a compiler?

Other places:




Web scripts are compiled into HTML
assembly language is compiled into
machine language
hardware description language is compiled
into a hardware circuit
...
Compilers are complex
front-end: text file to abstract syntax
  lexing; parsing
middle-end: abstract syntax to intermediate form (IR)
  type checking; analysis; optimizations
back-end: IR to machine code
  code generation; data layout; register allocation; more optimization
Course project
front-end: Fun Source Language
  simple imperative language
middle-end: Only 1 IR (the initial abstract syntax generated by the parser)
  type checking; high-level optimizations
back-end: Code Generation
  instruction selection algorithms; register allocation via graph coloring
Standard ML


Standard ML is a domain-specific
language for building compilers
Support for




Complex data structures (abstract syntax,
compiler intermediate forms)
Memory management like Java
Large projects with many modules
Advanced type system for error detection
Introduction to ML


You will be responsible for learning ML
on your own.
Today I will cover some basics

Resources:


Robert Harper’s Online book “an introduction to
ML” is a good place to start
See course webpage for pointers and info
about how to get the software
Preliminaries

start sml in Unix by typing sml at a
prompt:
tux% sml
Standard ML of New Jersey, Version 110.0.7, September
28, 2000 [CM; autoload enabled]
-
(* quit SML by pressing ctrl-D; sometimes ctrl-Z... *)
(* just so you know, comments can be (* nested *) *)
Preliminaries

Read – Eval – Print – Loop
- 3 + 2;
> 5: int
- it + 7;
> 12 : int
- it - 3;
> 9 : int
- 4 + true;
stdIn:17.1-17.9 Error: operator and operand don't agree [literal]
  operator domain: int * int
  operand:         int * bool
  in expression:
    4 + true
Preliminaries

Read – Eval – Print – Loop
- 3 div 0;
Failure : Div
run-time error
Basic Values
- ();
> () : unit
=> like “void” in C (sort of)
=> the uninteresting value/type
- true;
> true : bool
- false;
> false : bool
- if it then 3+2 else 7;
> 7 : int
- false andalso loop_Forever;
> false : bool
“else” clause is always necessary
andalso, orelse short-circuit evaluation
Basic Values
Integers
- 3 + 2;
> 5 : int
- 3 + (if not true then 5 else 7);
> 10 : int
Strings
- "Dave" ^ " " ^ "Walker";
> "Dave Walker" : string
- print "foo\n";
foo
> () : unit
Reals
- 3.14;
> 3.14 : real
No division between expressions
and statements
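The points above can be tried from a file as well as at the prompt. The following is a minimal sketch, not part of the original slides; the name loopForever is illustrative and simply stands for a computation that never returns.

```sml
(* A function that never returns; the name loopForever is illustrative. *)
fun loopForever () : bool = loopForever ()

(* andalso short-circuits: the right operand is never evaluated here. *)
val a = false andalso loopForever ()

(* if is an expression, so it can appear inside a larger expression.
   Both branches are required and must have the same type. *)
val b = 3 + (if a then 5 else 7)

(* String concatenation with ^; print returns the unit value (). *)
val s = "Dave" ^ " " ^ "Walker"
val () = print (s ^ "\n")
```

Because there are no statements, even the call to print is an expression; its value is () of type unit.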
Using SML/NJ
Interactive mode is a good way to start learning and to debug programs, but…
Type a series of declarations into a “.sml” file
- use "foo.sml";
[opening foo.sml]
…
    (list of declarations with their types)

Larger Projects


SML has its own built in interactive
“make”
Pros:



It automatically does the dependency
analysis for you
No crazy makefile syntax to learn
Cons:

May be more difficult to interact with other
languages or tools
Compilation Manager
sources.cm:
  Group is
    a.sig
    b.sml
    c.sml
% sml
- OS.FileSys.chDir "~/courses/510/a2";
- CM.make();
    looks for “sources.cm”, analyzes dependencies
[compiling…]
    compiles files in group
[wrote…]
    saves binaries in ./CM/
- CM.make' "myproj/"();
    specify directory
What is next?

ML has a rich set of structured values







Tuples: (17, true, "stuff")
Records: {name = "Dave", ssn = 332177}
Lists: 3::4::5::nil or [3,4]@[5]
Datatypes
Functions
And more!
Rather than list all the details, we will
write a couple of programs
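Before the larger programs, here is a small sketch of the first three kinds of structured values in use. The variable names and the sum function are illustrative and do not come from the slides.

```sml
(* Tuples: built with parentheses, taken apart with a pattern. *)
val t = (17, true, "stuff")
val (n, flag, _) = t

(* Records: fields are selected with #label. *)
val r = {name = "Dave", ssn = 332177}
val who = #name r

(* Lists: :: adds one element to the front, @ appends two lists. *)
val xs = 3 :: 4 :: 5 :: nil
val ys = [3, 4] @ [5]
val same = (xs = ys)

(* Functions over lists are usually written by pattern matching. *)
fun sum [] = 0
  | sum (x :: rest) = x + sum rest
val total = sum xs
```

Both list expressions build the same three-element list, which is why the equality test on xs and ys succeeds.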
An interpreter

Interpreters are usually implemented as a series of transformers:
stream of characters (concrete syntax)
  -> lexing/parsing -> abstract syntax
  -> evaluate -> abstract value
  -> print -> stream of characters
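As a rough illustration of this pipeline, the sketch below composes three functions with the o operator. It assumes a hypothetical toy language of sums of integer literals such as "1+2+3"; the names exp, parse, eval, show and interpret are invented for this example and are not the course's little language.

```sml
(* Hypothetical toy language: sums of integer literals such as "1+2+3". *)
datatype exp = Lit of int | Plus of exp * exp

(* lexing/parsing: stream of characters -> abstract syntax *)
fun parse (s : string) : exp =
  let
    val tokens = String.tokens (fn c => c = #"+") s
    val lits = map (fn tok => Lit (valOf (Int.fromString tok))) tokens
  in
    case lits of
      [] => raise Fail "empty input"
    | first :: rest => foldl (fn (e, acc) => Plus (acc, e)) first rest
  end

(* evaluate: abstract syntax -> abstract value (here, an int) *)
fun eval (Lit n) = n
  | eval (Plus (e1, e2)) = eval e1 + eval e2

(* print: abstract value -> stream of characters *)
fun show (n : int) : string = Int.toString n

(* The interpreter is the composition of the three transformers. *)
val interpret = show o eval o parse
val answer = interpret "1+2+3"
```

Each stage only needs to agree with its neighbours on the type passed between them, which is what makes the stages easy to write and test separately.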
A little language (LL)

An arithmetic expression e is





a boolean value
an if statement (if e1 then e2 else e3)
an integer
an add operation
a test for zero (isZero e)
LL abstract syntax in ML
datatype term =
Bool of bool
| If of term * term * term
| Num of int
| Add of term * term
| IsZero of term
vertical bar
separates alternatives
This one declaration creates:
• a new type (called term)
• a new set of functions for creating terms (Bool, If, Num, Add, IsZero)
• a new set of patterns you can use in case expressions (like C’s “switch”) to check what sort of term object you have
LL abstract syntax in ML
datatype term =
Bool of bool
| If of term * term * term
| Num of int
| Add of term * term
| IsZero of term
vertical bar
separates alternatives
-- by convention, constructors are capitalized
-- constructors can take a single argument of a particular type
-- term * term * term is the type of a tuple, in this case a triple of 3 term objects
LL abstract syntax in ML
In your program, writing:
    Add (Num 2, Num 3)
makes an object tagged with Add containing 2 sub-objects tagged with Num
represents the expression “2 + 3”
        Add
       /   \
     Num   Num
      |     |
      2     3
LL abstract syntax in ML
    If (Bool true,
        Num 0,
        Add (Num 2, Num 3))
represents “if true then 0 else 2 + 3”
            If
         /   |   \
      Bool  Num  Add
       |     |   /  \
     true    0  Num Num
                 |   |
                 2   3
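To make the tree view concrete, the sketch below builds the term for “if true then 0 else 2 + 3” and counts its nodes. The function size is an illustration added here, not something defined in the slides.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

(* The term drawn as a tree: If at the root with three children. *)
val example = If (Bool true, Num 0, Add (Num 2, Num 3))

(* size counts one node per constructor application. *)
fun size t =
  case t of
    Bool _ => 1
  | Num _ => 1
  | If (t1, t2, t3) => 1 + size t1 + size t2 + size t3
  | Add (t1, t2) => 1 + size t1 + size t2
  | IsZero t1 => 1 + size t1

val nodes = size example
```

The recursion follows the shape of the datatype: one case per constructor, and one recursive call per sub-term.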
Function declarations
fun isValue (t:term) : bool =
  case t of
    Num n => true
  | Bool b => true
  | _ => false
function name: isValue
function parameter: t with type term
function result type is bool
patterns: Num n, Bool b, _
default pattern _ matches anything
Function declarations
fun isValue t =
case t of
Num n => true
| Bool b => true
| _ => false
ML type inference
can infer the types
of parameters and
results
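ML also allows the case analysis to be written directly in the function's clauses. The sketch below is one equivalent way to write isValue, again without any type annotations; the list named checks is only there to exercise it.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

(* No annotations: the constructor patterns force the parameter to be a term,
   and the results true/false force the result to be bool. *)
fun isValue (Num _) = true
  | isValue (Bool _) = true
  | isValue _ = false

val checks = [isValue (Num 3), isValue (Bool false), isValue (Add (Num 1, Num 2))]
```

The inferred type is term -> bool, the same as the annotated version.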
A type error
fun isValue t =
case t of
Num n => n
| _ => false
ex.sml:22.3-24.15 Error: types of rules don't agree [literal]
earlier rule(s): term -> int
this rule: term -> bool
in rule:
_ => false
A type error
Sometimes, ML will give you several errors in a row:
ex.sml:22.3-25.15 Error: types of rules don't agree [literal]
earlier rule(s): term -> int
this rule: term -> bool
in rule:
_ => true
ex.sml:22.3-25.15 Error: types of rules don't agree [literal]
earlier rule(s): term -> int
this rule: term -> bool
in rule:
_ => false
A very subtle error
fun isValue t =
case t of
num => true
| _ => false
The code above type checks. But when
we test it, the function always returns “true.”
What has gone wrong?
-- num is not capitalized (and has no argument)
-- ML treats it like a variable pattern (matches anything!)
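The difference between a variable pattern and a constructor pattern can be seen side by side in the sketch below. The function alwaysTrue and the variable name anything are illustrative. The buggy two-rule version is deliberately not reproduced, since depending on the compiler and its settings an unreachable second rule may be reported as redundant.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

(* anything is a variable pattern: it matches every term and binds it. *)
fun alwaysTrue t =
  case t of
    anything => true

(* Num _ and Bool _ are constructor patterns: they match only terms
   built with that constructor. *)
fun isValue t =
  case t of
    Num _ => true
  | Bool _ => true
  | _ => false
```

Applied to Add (Num 1, Num 2), the first function answers true and the second answers false.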
Exceptions
exception Error of string
fun debug s : unit = raise (Error s)
in SML interpreter:
- debug "hello";
uncaught exception Error
raised at: ex.sml:15.28-15.35
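An exception does not have to reach the top level. The following sketch shows one way to catch it with handle; the binding msg and the strings it produces are illustrative.

```sml
exception Error of string

fun debug (s : string) : unit = raise (Error s)

(* handle catches the exception and gives access to the string it carries. *)
val msg =
  (debug "hello"; "no exception raised")
  handle Error s => "caught: " ^ s
```

Both the protected expression and the handler must have the same type, here string.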
Evaluator
fun isValue t = ...
exception NoRule
fun eval t =
case t of
Bool _ | Num _ => t
| ...
Evaluator
...
fun eval t =
  case t of
    Bool _ | Num _ => t
  | If(t1,t2,t3) =>
      let val v = eval t1 in   (* let expression for remembering temporary results *)
        case v of
          Bool b => if b then (eval t2) else (eval t3)
        | _ => raise NoRule
      end
Evaluator
exception NoRule
fun eval t =
  case t of
    Bool _ | Num _ => ...
  | ...
  | Add (t1,t2) =>
      (case (eval t1, eval t2) of
         (Num n1, Num n2) => Num (n1 + n2)
       | (_,_) => raise NoRule)
Finishing the Evaluator
fun eval t =
  case t of
    ...
  | ...
  | Add (t1,t2) => ...
  | IsZero t => ...
be sure your case is exhaustive
Finishing the Evaluator
fun eval t =
  case t of
    ...
  | ...
  | Add (t1,t2) => ...
What if we forgot a case?
ex.sml:25.2-35.12 Warning: match nonexhaustive
  (Bool _ | Num _) => ...
  If (t1,t2,t3) => ...
  Add (t1,t2) => ...
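Putting the pieces together, one possible complete evaluator is sketched below. The IsZero case is an assumption, since the slides leave it elided: here it yields Bool true exactly when its argument evaluates to Num 0. The values Bool _ and Num _ are given separate rules so the code does not depend on or-patterns, and nested case expressions are parenthesized so that later rules attach to the outer case.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

exception NoRule

fun eval t =
  case t of
    Bool _ => t
  | Num _ => t
  | If (t1, t2, t3) =>
      (case eval t1 of
         Bool b => if b then eval t2 else eval t3
       | _ => raise NoRule)
  | Add (t1, t2) =>
      (case (eval t1, eval t2) of
         (Num n1, Num n2) => Num (n1 + n2)
       | _ => raise NoRule)
  | IsZero t1 =>
      (* Assumed semantics: true exactly when the argument evaluates to Num 0. *)
      (case eval t1 of
         Num n => Bool (n = 0)
       | _ => raise NoRule)

(* "if isZero 0 then 1 else 2 + 3" *)
val result = eval (If (IsZero (Num 0), Num 1, Add (Num 2, Num 3)))

(* Adding a boolean to a number has no rule, so NoRule is raised. *)
val stuck = (eval (Add (Bool true, Num 1)); false) handle NoRule => true
```

With every constructor covered, the match is exhaustive and the compiler has no missing case to warn about.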
Summary

All ML expressions produce values that have a particular type
  ML doesn’t have “statements”
  ML can do type inference (and give you hard-to-decrypt error messages)
ML data types are super-cool
  a new type name (term)
  new constructors (Bool, If, Num, ...)
  new patterns (Bool b, If (x,y,_), Num _, ...)
ML has a “top-level loop” to execute commands and a compilation manager
  type CM.make() to load and compile a project
  edit sources.cm to add new files
Last Things



Learning to program in SML can be
tricky at first
But once you get used to it, you will
never want to go back to imperative
languages
Check out the reference materials listed
on the course homepage