CSE 305 Introduction to Programming Languages
Lecture 13 – Type System (1)
CSE @ SUNY-Buffalo
Zhi Yang
Courtesy of Professor Benjamin C. Pierce
Courtesy of Dr. Chung-Chih Li

Declarations
• This lecture draws heavily on the online material “Type Systems for Programming Languages” by Benjamin C. Pierce. If you are interested in the official publication, please contact the author and the publisher, MIT Press. This lecture is for teaching purposes only and has nothing to do with any other commercial or academic activities.

Notice Board
• First, we will be posting homework 6 during the weekend, so you can schedule your time accordingly.
• Second, homework 5 is due on Monday (July 15, 2013) at 11:59 pm.

Our objective
• The first objective of our class is to comprehend a new programming language within a very short time period. Because you can shorten your learning curve in this way, you will be able to work with the language with real insight.
• The second objective is to even engineer your own language!

Review what we've learnt and see the future
[Diagram labels, in the order they appear:]
• eg: Egyptian Number System; Complement Number
• eg: Abacus Number System
• eg: Gate system, including different underlying devices
• 1st Generation language: Machine Code
• eg: MIPS
• 2nd Generation language: Assembly Code
• eg: Fortran
• Regular Expression
• What's next?
• 3rd Generation Language: Macro function
• Basic Calculation System, Lexer, Compiler System, Virtual Machine, Parser, Push Down Automata, Type Checking, Context-Free Grammar, Lambda Calculus, Theory

A family tree of languages
Cobol, <Fortran>, BASIC, Algol 60, <LISP>, PL/1, Simula, <ML>, Algol 68, C, Pascal, <Perl>, <C++>, Modula 3, Dylan, Ada, <Java>, <C#>, <Scheme>, <Smalltalk>, <Ruby>, <Python>, <Haskell>, <Prolog>, <JavaScript>

Review: Concept of Macro
A macro (short for "macroinstruction", from Greek μακρο- 'long') in computer science is a rule or pattern that specifies how a certain input sequence (often a sequence of characters) should be mapped to a replacement input sequence (also often a sequence of characters) according to a defined procedure. The mapping process that instantiates (transforms) a macro use into a specific sequence is known as macro expansion.

What is the relation between a macro function and a type system?
Let's look at how the Microsoft MSDN library talks about it.
Most Microsoft run-time library routines are compiled or assembled functions, but some routines are implemented as macros. When a header file declares both a function and a macro version of a routine, the macro definition takes precedence, because it always appears after the function declaration. When you invoke a routine that is implemented as both a function and a macro, you can force the compiler to use the function version in two ways:
• Enclose the routine name in parentheses.
    #include <ctype.h>
    a = toupper(a);      //use macro version of toupper
    a = (toupper)(a);    //force compiler to use function version of toupper
• "Undefine" the macro definition with the #undef directive:
    #include <ctype.h>
    #undef toupper
If you need to choose between a function and a macro implementation of a library routine, consider the following trade-offs:
• Speed versus size. The main benefit of using macros is faster execution time. During preprocessing, a macro is expanded (replaced by its definition) inline each time it is used.
A function definition occurs only once regardless of how many times it is called. Macros may increase code size but do not have the overhead associated with function calls.
• Function evaluation. A function evaluates to an address; a macro does not. Thus you cannot use a macro name in contexts requiring a pointer. For instance, you can declare a pointer to a function, but not a pointer to a macro.

Comparison
• Macro side effects. A macro may treat arguments incorrectly when the macro evaluates its arguments more than once. For instance, the toupper macro is defined as:
    #define toupper(c) ( (islower(c)) ? _toupper(c) : (c) )
  In the following example, the toupper macro produces a side effect:
    #include <ctype.h>
    int a = 'm';
    a = toupper(a++);
  The example code increments a when passing it to toupper. The macro evaluates the argument a++ twice, once to check case and again for the result, therefore increasing a by 2 instead of 1. As a result, the value operated on by islower differs from the value operated on by toupper.
• Type-checking. When you declare a function, the compiler can check the argument types. Because you cannot declare a macro, the compiler cannot check macro argument types, although it can check the number of arguments you pass to a macro.

Type System?
• A useful, though rough, distinction divides the world of programming languages into two parts:
• Untyped: programs simply execute flat out; there is no attempt to check consistency of shapes.
• Typed: some attempt is made, either at compile time or at run time, to check shape-consistency.

A Brief History of Type
A Brief History of Type (Cont.)
The earliest type system appears in Fortran.

Mathematical Preliminaries (Set & Relations)
Review Untyped Lambda
Basics
Formalities
This is an example of an inductive definition. Since inductive definitions are ubiquitous in the study of programming languages, it is worth pausing for a moment to examine this one in detail.
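The definition this slide refers to was shown as an image and does not survive in the text. As a rough sketch, assuming it is the usual term language of booleans and numbers from Pierce's notes (true, false, if, 0, succ, pred, iszero), an inductive definition corresponds closely to an ML datatype, and functions over it are written by one clause per rule. The names term, termSize and termDepth are illustrative only.

```sml
(* Assumed term language: each constructor is one clause of the
   inductive definition. *)
datatype term =
    TmTrue
  | TmFalse
  | TmIf of term * term * term
  | TmZero
  | TmSucc of term
  | TmPred of term
  | TmIsZero of term

(* Number of nodes in a term. *)
fun termSize TmTrue = 1
  | termSize TmFalse = 1
  | termSize TmZero = 1
  | termSize (TmSucc t) = termSize t + 1
  | termSize (TmPred t) = termSize t + 1
  | termSize (TmIsZero t) = termSize t + 1
  | termSize (TmIf (t1, t2, t3)) =
      termSize t1 + termSize t2 + termSize t3 + 1

(* Length of the longest path from the root to a leaf. *)
fun termDepth TmTrue = 1
  | termDepth TmFalse = 1
  | termDepth TmZero = 1
  | termDepth (TmSucc t) = termDepth t + 1
  | termDepth (TmPred t) = termDepth t + 1
  | termDepth (TmIsZero t) = termDepth t + 1
  | termDepth (TmIf (t1, t2, t3)) =
      Int.max (termDepth t1, Int.max (termDepth t2, termDepth t3)) + 1

val sample = TmIf (TmTrue, TmZero, TmSucc TmZero)
```

Because every term is built by exactly one constructor, each function needs exactly one clause per constructor, which is the same shape a proof by induction on terms takes.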
Here is an alternative definition of the same set, in a more concrete style.

[Slide titles, in order; the slide bodies were figures:]
Set Definition
Two Definitions are equal: T = S
Explanation
Constants
Size and depth of Term
Relation between constants and Size
Evaluation
Evaluation Process
Properties
Implementation
Implementation (Cont.)
Summary of Type Basics (1)
Summary of Type Basics (2)
Operational Semantics
Programming in the Lambda-Calculus
Example
Grammars
Substitution
Solve Problem
Solve Problem (Cont.)
Operational Semantics
Summary of Untyped Lambda Calculus
Implementing the Lambda-Calculus
Definition of Terms
Shifting and Substitution
Evaluation
A Concrete Realization
Evaluation
Typed Arithmetic Expressions
The Typing Relation
Properties of Typing and Reduction
Type Checking
Safety = Preservation + Progress
Implementation
Summary for boolean and numbers

Simply Typed Lambda-Calculus
Chapter 7: Simply Typed Lambda-Calculus
This chapter introduces the most elementary member of the family of typed languages that we shall be studying for the rest of the course: the simply typed lambda-calculus of Church [Chu40] and Curry [CF58].

7.1 Syntax
7.1.1 Definition: The set of simple types over the atomic type $\mathtt{Bool}$ (for brevity, we omit natural numbers) is generated by the following grammar:
    $T ::=$                      (types...)
        $T \to T$                type of functions
        $\mathtt{Bool}$          type of booleans
The type constructor $\to$ is right-associative: $T_1 \to T_2 \to T_3$ stands for $T_1 \to (T_2 \to T_3)$.

7.1.2 Definition: The abstract syntax of simply typed lambda-terms (with booleans and conditional) is defined by the following grammar:
    $t ::=$                                                          (terms...)
        $x$                                                          variable
        $\lambda x{:}T.\ t$                                          abstraction
        $t\ t$                                                       application
        $\mathtt{true}$                                              constant true
        $\mathtt{false}$                                             constant false
        $\mathtt{if}\ t\ \mathtt{then}\ t\ \mathtt{else}\ t$         conditional

Convention
• In the literature on type systems, two different presentation styles are commonly used:
• In implicitly typed (or, for historical reasons, Curry-style) systems, the pure (untyped) lambda-calculus is used as the term language. The typing rules define a relation between untyped terms and the types that classify them.
• In explicitly typed (or Church-style) systems, the term language itself is refined so that terms carry some type information within them; for example, the bound variables in function abstractions are always annotated with the type of the expected parameter. The type system relates typed terms and their types.
• To a large degree, the choice is a matter of taste, though explicitly typed systems generally pose fewer algorithmic problems for typecheckers. We will adopt an explicitly typed presentation throughout.

The Typing Relation (1)
7.2 The Typing Relation
In order to assign a type to an abstraction like $\lambda x{:}T_1.\ t_2$, we need to know what will happen later when it is applied to some argument. The annotation on the bound variable tells us that we may assume that the argument will be of type $T_1$. In other words, the type of the result will be just the type of $t_2$, where occurrences of $x$ in $t_2$ are assumed to denote terms of type $T_1$. This intuition is captured by the following rule:
    $\dfrac{x{:}T_1 \vdash t_2 : T_2}{\vdash \lambda x{:}T_1.\ t_2 : T_1 \to T_2}$    (T-ABS)
Since, in general, function abstractions can be nested, typing assertions actually have the form $\Gamma \vdash t : T$, pronounced “term $t$ has type $T$ under the assumptions $\Gamma$ about the types of its free variables.” Formally, the typing context $\Gamma$ is just a list of variables and their types, and the “comma” operator extends $\Gamma$ by concatenating a new binding on the right. To avoid confusion between the new binding and any bindings that may already appear in $\Gamma$, we require that the name $x$ be chosen so that it does not already appear in $\mathrm{dom}(\Gamma)$. (As usual, this condition can always be satisfied by renaming the bound variable if necessary.) $\Gamma$ can thus be thought of as a finite function from variables to their types. Following this intuition, we will write $\mathrm{dom}(\Gamma)$ for the set of variables bound by $\Gamma$ and $\Gamma(x)$ for the type associated with $x$ in $\Gamma$. So the rule for typing abstractions actually has the general form
    $\dfrac{\Gamma,\ x{:}T_1 \vdash t_2 : T_2}{\Gamma \vdash \lambda x{:}T_1.\ t_2 : T_1 \to T_2}$    (T-ABS)
where the premise adds one more assumption to those in the conclusion.

January 15, 2000    7. Simply Typed Lambda-Calculus

The Typing Relation (2)
The typing rule for variables follows immediately from this discussion.
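A variable has whatever type we are currently assuming it to have:
    $\dfrac{\Gamma(x) = T}{\Gamma \vdash x : T}$    (T-VAR)
Next, we need a rule for application:
    $\dfrac{\Gamma \vdash t_1 : T_{11} \to T_{12} \qquad \Gamma \vdash t_2 : T_{11}}{\Gamma \vdash t_1\ t_2 : T_{12}}$    (T-APP)
In English: If $t_1$ evaluates to a function mapping arguments in $T_{11}$ to results in $T_{12}$ (under the assumption that the terms represented by its free variables yield results of the types associated to them by $\Gamma$), and if $t_2$ evaluates to a result in $T_{11}$, then the result of applying $t_1$ to $t_2$ will be a value of type $T_{12}$.
We have now given typing rules for each of the individual constructs in our simple language. To assign types to whole programs, we combine these rules into derivation trees. For example, here is a derivation tree showing that the term $(\lambda x{:}\mathtt{Bool}.\ x)\ \mathtt{true}$ has type $\mathtt{Bool}$ in the empty context:
    $x{:}\mathtt{Bool} \vdash x : \mathtt{Bool}$                                     (T-VAR)
    $\vdash \lambda x{:}\mathtt{Bool}.\ x : \mathtt{Bool} \to \mathtt{Bool}$         (T-ABS, from the line above)
    $\vdash \mathtt{true} : \mathtt{Bool}$                                           (T-TRUE)
    $\vdash (\lambda x{:}\mathtt{Bool}.\ x)\ \mathtt{true} : \mathtt{Bool}$          (T-APP, from the two lines above)
7.2.1 Exercise [Quick check]: Show (by exhibiting derivation trees) that the fol…
… typed lambda-calculus with many other base types, in addition to or instead of booleans. We therefore split the formal summary of the system into two pieces: the pure simply typed lambda calculus $\lambda_\to$, with no base types at all, and a separate extension with booleans.

Summary for Simply Typed Lambda-Calculus
$\lambda_\to$: Simply typed lambda-calculus (typed)
Syntax
    $t ::= x \mid \lambda x{:}T.\ t \mid t\ t$            (terms: variable, abstraction, application)
    $v ::= \lambda x{:}T.\ t$                             (values: abstraction value)
    $T ::= T \to T$                                       (types: type of functions)
    $\Gamma ::= \emptyset \mid \Gamma,\ x{:}T$            (contexts: empty context, term variable binding)
Evaluation ($t \longrightarrow t'$)
    $(\lambda x{:}T_{11}.\ t_{12})\ v_2 \longrightarrow [x \mapsto v_2]\,t_{12}$    (E-BETA)
    $\dfrac{t_1 \longrightarrow t_1'}{t_1\ t_2 \longrightarrow t_1'\ t_2}$          (E-APP1)
    $\dfrac{t_2 \longrightarrow t_2'}{v_1\ t_2 \longrightarrow v_1\ t_2'}$          (E-APP2)
Typing ($\Gamma \vdash t : T$)
    $\dfrac{\Gamma(x) = T}{\Gamma \vdash x : T}$    (T-VAR)
    $\dfrac{\Gamma,\ x{:}T_1 \vdash t_2 : T_2}{\Gamma \vdash \lambda x{:}T_1.\ t_2 : T_1 \to T_2}$    (T-ABS)
    $\dfrac{\Gamma \vdash t_1 : T_{11} \to T_{12} \qquad \Gamma \vdash t_2 : T_{11}}{\Gamma \vdash t_1\ t_2 : T_{12}}$    (T-APP)
The highlighted areas here are used to mark material that is new with respect to the untyped lambda-calculus: whole new rules as well as new bits that need to be …

Properties of Typing and Reduction (1)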
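The typing rules T-VAR, T-ABS and T-APP can be read as the clauses of a recursive function from a context and a term to a type. The following is a minimal sketch in Standard ML, assuming Bool as the only base type and representing the context as a list of bindings. The names ty, term, lookup, typeOf and TypeError are illustrative and are not taken from Pierce's implementation.

```sml
(* Types: Bool and function types T1 -> T2. *)
datatype ty = TyBool | TyArr of ty * ty

(* Terms of the simply typed lambda-calculus with booleans. *)
datatype term =
    Var of string
  | Abs of string * ty * term      (* lambda x:T. t *)
  | App of term * term
  | True
  | False
  | If of term * term * term

exception TypeError of string

(* The context Gamma as a list of (variable, type) bindings, newest first.
   Looking up the newest binding first makes an inner binding shadow an
   outer one, so no renaming of bound variables is needed here. *)
fun lookup [] x = raise TypeError ("unbound variable " ^ x)
  | lookup ((y, tyY) :: rest) x = if x = y then tyY else lookup rest x

fun typeOf ctx (Var x) = lookup ctx x                        (* T-VAR *)
  | typeOf ctx (Abs (x, ty1, body)) =                        (* T-ABS *)
      TyArr (ty1, typeOf ((x, ty1) :: ctx) body)
  | typeOf ctx (App (t1, t2)) =                              (* T-APP *)
      (case typeOf ctx t1 of
           TyArr (ty11, ty12) =>
             if typeOf ctx t2 = ty11 then ty12
             else raise TypeError "argument type mismatch"
         | TyBool => raise TypeError "function type expected")
  | typeOf ctx True = TyBool
  | typeOf ctx False = TyBool
  | typeOf ctx (If (t1, t2, t3)) =
      if typeOf ctx t1 <> TyBool then raise TypeError "guard is not Bool"
      else
        let val ty2 = typeOf ctx t2
        in
          if typeOf ctx t3 = ty2 then ty2
          else raise TypeError "branches have different types"
        end

(* The term from the derivation tree: (lambda x:Bool. x) true *)
val idBool = Abs ("x", TyBool, Var "x")
val example = typeOf [] (App (idBool, True))
```

Each clause inspects only the outermost constructor of the term, so the recursion of typeOf has the same shape as the derivation tree it implicitly builds.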
As in Chapter 6, we need to develop a few basic lemmas before we can prove type soundness. Most of these are similar to what we saw before (just adding contexts to the typing relation and adding appropriate clauses for $\lambda$-terms). The only significant new requirement is a substitution principle for the typing relation (Lemma 7.4.9).

7.4.1 Lemma [Inversion of the typing relation]:
1. If $\Gamma \vdash x : R$, then $\Gamma(x) = R$.
2. If $\Gamma \vdash \lambda x{:}T_1.\ t_2 : R$, then $R = T_1 \to R_2$ for some $R_2$ and $\Gamma,\ x{:}T_1 \vdash t_2 : R_2$.
3. If $\Gamma \vdash t_1\ t_2 : R$, then there is some type $T_{11}$ such that $\Gamma \vdash t_1 : T_{11} \to R$ and $\Gamma \vdash t_2 : T_{11}$.
4. If $\Gamma \vdash \mathtt{true} : R$, then $R = \mathtt{Bool}$.
5. If $\Gamma \vdash \mathtt{false} : R$, then $R = \mathtt{Bool}$.
6. If $\Gamma \vdash \mathtt{if}\ t_1\ \mathtt{then}\ t_2\ \mathtt{else}\ t_3 : R$, then $\Gamma \vdash t_1 : \mathtt{Bool}$ and $\Gamma \vdash t_2 : R$ and $\Gamma \vdash t_3 : R$.
Proof: Immediate from the definition of the typing relation.

[A further lemma whose statement is not recoverable from the slides.] Proof: Straightforward induction on typing derivations.

Typechecking
In Section 7.1, we chose to use an explicitly typed presentation of the calculus, partly in order to simplify the algorithmic issues involved in typechecking. This involved adding type annotations to bound variables in function abstractions, but nowhere else. In what sense is this “enough”?
One answer is provided by the “uniqueness of types” theorem, which says in essence that well-typed terms are in one-to-one correspondence with the derivations that justify their well-typedness (in a given environment). The derivation can be recovered immediately from the term, and vice versa. In fact, the correspondence is so straightforward that, in a sense, there is little difference between the term and the derivation. (For many of the type systems that we will see, this simple correspondence will not hold: there will be significant work involved in showing that typing derivations can be recovered effectively from typed terms.)

7.4.3 Theorem [Uniqueness of Types]: In a given typing context $\Gamma$, a term $t$ (with free variables all in the domain of $\Gamma$) has at most one type. That is, if a term is typeable, then its type is unique. Moreover, there is just one derivation of this typing built from the inference rules that generate the typing relation.
The proof of the uniqueness theorem is so direct that there is almost nothing to say. We present a few cases carefully just to illustrate the structure of proofs by induction on typing derivations.
Proof: Suppose that $\Gamma \vdash t : S$ and $\Gamma \vdash t : T$. We show, by induction on a derivation of $\Gamma \vdash t : T$, that $S = T$.
Case T-VAR: $t = x$ with $\Gamma(x) = T$.
    By case (1) of the inversion lemma (7.4.1), the final rule in any derivation of $\Gamma \vdash t : S$ must also be T-VAR, and $S = T$.
Case T-ABS: $t = \lambda y{:}T_2.\ t_1$, $T = T_2 \to T_1$, and $\Gamma,\ y{:}T_2 \vdash t_1 : T_1$.
    By case (2) of the inversion lemma, the final rule in any derivation of $\Gamma \vdash t : S$ must also be T-ABS, and this derivation must have a subderivation with conclusion $\Gamma,\ y{:}T_2 \vdash t_1 : S_1$, with $S = T_2 \to S_1$. By the induction hypothesis (on the subderivation with conclusion $\Gamma,\ y{:}T_2 \vdash t_1 : T_1$), we obtain $S_1 = T_1$, from which $S = T$ is immediate.
Case T-APP, T-TRUE, T-FALSE, T-IF: Similar.

Typing and Substitution
7.4.9 Lemma [Substitution]: If $\Gamma,\ x{:}S \vdash t : T$ and $\Gamma \vdash s : S$, then $\Gamma \vdash [x \mapsto s]\,t : T$.
7.4.10 Exercise [Recommended]: Prove the substitution lemma, using an induction on the depth of typing derivations and Lemma 7.4.1. The full proof appears in the solutions; try to write it out yourself before having a look at the answer. (Solution on page 255.)

Type Soundness
7.4.11 Theorem [Preservation of types during evaluation]: If $\Gamma \vdash t : T$ and $t \longrightarrow t'$, then $\Gamma \vdash t' : T$.
7.4.12 Theorem [Progress]: Suppose $t$ is closed and stuck. If $\vdash t : T$, then $t$ is a value.
Proof: Outline: first show that every closed value of type $\mathtt{Bool}$ is either $\mathtt{true}$ or $\mathtt{false}$ and every closed value of type $T_1 \to T_2$ is a $\lambda$-abstraction.
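The evaluation rules E-BETA, E-APP1 and E-APP2, together with the progress theorem, can be explored with a small-step evaluator. The sketch below is in Standard ML and rests on two assumptions: programs are closed, so substitution never has to rename bound variables, and booleans with a conditional are included so that closed values exist. The names isValue, subst, step and eval are illustrative.

```sml
datatype ty = TyBool | TyArr of ty * ty

datatype term =
    Var of string
  | Abs of string * ty * term
  | App of term * term
  | True
  | False
  | If of term * term * term

fun isValue (Abs _) = true
  | isValue True = true
  | isValue False = true
  | isValue _ = false

(* subst x s t computes [x |-> s] t. It performs no renaming, so it is
   only safe when s is closed, as it is when evaluating closed programs. *)
fun subst x s (Var y) = if x = y then s else Var y
  | subst x s (Abs (y, ty1, body)) =
      if x = y then Abs (y, ty1, body)          (* x is shadowed here *)
      else Abs (y, ty1, subst x s body)
  | subst x s (App (t1, t2)) = App (subst x s t1, subst x s t2)
  | subst x s (If (t1, t2, t3)) =
      If (subst x s t1, subst x s t2, subst x s t3)
  | subst _ _ t = t                             (* True and False *)

(* One step of call-by-value evaluation; NONE means no rule applies. *)
fun step (App (Abs (x, ty1, body), t2)) =
      if isValue t2 then SOME (subst x t2 body)                   (* E-BETA *)
      else Option.map (fn t2' => App (Abs (x, ty1, body), t2'))
                      (step t2)                                   (* E-APP2 *)
  | step (App (t1, t2)) =
      if isValue t1
      then Option.map (fn t2' => App (t1, t2')) (step t2)         (* E-APP2 *)
      else Option.map (fn t1' => App (t1', t2)) (step t1)         (* E-APP1 *)
  | step (If (True, t2, _)) = SOME t2
  | step (If (False, _, t3)) = SOME t3
  | step (If (t1, t2, t3)) =
      Option.map (fn t1' => If (t1', t2, t3)) (step t1)
  | step _ = NONE

(* Repeat step until no rule applies. *)
fun eval t = case step t of SOME t' => eval t' | NONE => t

val idBool = Abs ("x", TyBool, Var "x")
val result = eval (If (App (idBool, False), True, False))
val stuck = App (True, False)   (* ill-typed: no rule applies, not a value *)
```

Here step returns NONE both for values and for the stuck term App (True, False). Progress says the second situation cannot arise for closed well-typed terms, and indeed App (True, False) has no type under T-APP.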
7.5 Implementation
Implementation

Extensions (1)
January 15, 2000    8. Extensions

8.1 Base Types
Base types
New syntactic forms
    $T ::= \ldots \mid A$                        (types: base type)

8.2 Unit type
Unit type
New syntactic forms
    $t ::= \ldots \mid \mathtt{unit}$            (terms: constant unit)
    $T ::= \ldots \mid \mathtt{Unit}$            (types: unit type)
New typing rules ($\Gamma \vdash t : T$)
    $\Gamma \vdash \mathtt{unit} : \mathtt{Unit}$    (T-UNIT)
Note: C and Java's void type is a misnomer; what they really mean is Unit.

8.3 Let bindings
Let binding
New syntactic forms
    $t ::= \ldots \mid \mathtt{let}\ x{=}t\ \mathtt{in}\ t$    (terms: let binding)
New evaluation rules ($t \longrightarrow t'$)
    $\mathtt{let}\ x{=}v_1\ \mathtt{in}\ t_2 \longrightarrow [x \mapsto v_1]\,t_2$    (E-LETBETA)
    $\dfrac{t_1 \longrightarrow t_1'}{\mathtt{let}\ x{=}t_1\ \mathtt{in}\ t_2 \longrightarrow \mathtt{let}\ x{=}t_1'\ \mathtt{in}\ t_2}$    (E-LET)
New typing rules ($\Gamma \vdash t : T$)
    $\dfrac{\Gamma \vdash t_1 : T_1 \qquad \Gamma,\ x{:}T_1 \vdash t_2 : T_2}{\Gamma \vdash \mathtt{let}\ x{=}t_1\ \mathtt{in}\ t_2 : T_2}$    (T-LET)
8.3.1 Exercise [Recommended]: Add let bindings to the “…” typechecker.

Extensions (2)
Records and tuples
New syntactic forms: terms (record, projection); values (record value); types (type of records)
New evaluation rules: (E-RCDBETA), (E-PROJ), (E-RECORD)
New typing rules: (T-RCD), (T-PROJ)
New abbreviations
8.4.1 Exercise: In this presentation of records, the projection operation is used to extract the fields of a record one by one. Many high-level programming languages …

Extensions (3)
Record patterns (untyped)
New syntactic forms: patterns (variable pattern, record pattern); terms (pattern binding)
Matching rules: (M-VAR), (M-RCD)
New evaluation rules: (E-LETBETA), (E-LET)
New abbreviations
Your job is to add types to this calculus, in the style of the simply typed lambda- …

Extensions (4)
8.5 Variants
Variants
New syntactic forms: terms (tagging, case); types (type of variants)
New evaluation rules: (E-CASEBETA), (E-CASE), (E-TAG)
New typing rules: (T-VARIANT), (T-CASE)

Extensions (5)
General recursion
New syntactic forms: terms (fixed point of $t$)
New typing rules: (T-FIX)
New abbreviations
A corollary of the definability of fixed-point combinators at every type is that every type in this system is inhabited. For example, for each type $T$, the application … has type $T$, where … is defined like this: …

Extensions (6)
8.7 Lists
Lists
New syntactic forms: terms (empty list, list constructor, test for empty list, head of a list, tail of a list); values (empty list, list constructor); types (type of lists)
New evaluation rules: (E-CONS1), (E-CONS2), (E-NULLBETAT), (E-NULLBETAF), (E-NULL), (E-HEADBETA), (E-HEAD), (E-TAILBETA), (E-TAIL)
New typing rules: (T-NIL), (T-CONS), (T-NULL), (T-HEAD), (T-TAIL)

Extensions (7)
8.8 Lazy records and let-bindings
Lazy let bindings
New syntactic forms: terms (lazy let binding)
New evaluation rules: (E-LLETBETA)
New typing rules: (T-LLET)
Lazy records
New syntactic forms: terms (lazy record); values (lazy record value)
New evaluation rules: (E-LRCDBETA)
New typing rules: (T-LRCD)
Exercise: write down rules for lazy functions. (These are used only in the advanced parts of the object examples.)

ML
• Meta Language
• One of the most popular functional languages
• Edinburgh, 1974, Robin Milner's group with Lockwood Morris
• There are a number of dialects
• We are using Standard ML (SML), but we will just call it ML for the time being
Robin Milner, 1934-2010, Turing Award 1991

SML/NJ
Standard ML of New Jersey (abbreviated SML/NJ) is a compiler for the Standard ML '97 programming language with associated libraries, tools, and documentation. SML/NJ is free, open source software.
http://www.smlnj.org/dist/working/index.html
You are encouraged to install SML/NJ on your computer, but we are going to use the one installed on our Unix server Timberlake, and all your work will be tested there. You can write your code on any platform; just let me know what it is.

Standard ML of New Jersey
    - 1+2*3;
    val it = 7 : int
    - 1+2*3
    = ;
    val it = 7 : int
• - is the prompt
• = means that ML expects more input; it needs ; to end the expression
• Expression (; is not part of the expression.)
• ML replies with value and type
• Variable it is a special variable bound to the value of the expression just typed

ML Basic (the building blocks)
• Constants
• Operators
• Defining Variables
• Tuples and Lists
• Defining Functions
• ML Types and Type Annotations

Mottoes of ML
• Type is the backbone of ML
• Recursion is the blood of ML
• Function is the flesh of ML
• Higher order is the soul of ML

Constants
    - 1234;
    val it = 1234 : int
    - 123.4;
    val it = 123.4 : real
• Integer constants: standard integers
• Real constants: standard decimal notation
• int, real are the names of the types

    - true;
    val it = true : bool
    - false;
    val it = false : bool
• Boolean constants true and false
• ML is case-sensitive: use true, not True or TRUE
• Type name: bool

    - "fred";
    val it = "fred" : string
    - "H";
    val it = "H" : string
    - #"H";
    val it = #"H" : char
• String constants: text inside double quotes
• Can use C-style escapes: \n, \t, \\, \", etc.
• Character constants: put # before a 1-character string
• Type names: string and char

Operators
    - ~ 1 + 2 - 3 * 4 div 5 mod 6;
    val it = ~1 : int
• Standard operators for integers
• ~ (tilde) for unary negation, e.g., ~1, negative 1
• - for binary subtraction

    - ~ 1.0 + 2.0 - 3.0 * 4.0 / 5.0;
    val it = ~1.4 : real
• Same operators for reals (use / for real division)
• Left associative; precedence is {+,-} < {*,/,div,mod} < {~}.
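One way to check a reading of the precedence and associativity rules above is to write the slide's expressions twice, once as typed and once fully parenthesized, and compare the results. This is a small sketch; the variable names are illustrative.

```sml
(* Integer expression from the slide, and its fully parenthesized form:
   ~ binds tightest, then * div mod (left to right), then + - (left to right). *)
val implicitInt = ~ 1 + 2 - 3 * 4 div 5 mod 6
val explicitInt = ((~ 1) + 2) - (((3 * 4) div 5) mod 6)

(* The same check for the real-valued expression. *)
val implicitReal = ~ 1.0 + 2.0 - 3.0 * 4.0 / 5.0
val explicitReal = ((~ 1.0) + 2.0) - ((3.0 * 4.0) / 5.0)

val sameInt = (implicitInt = explicitInt)
(* real is not an equality type, so Real.== is used instead of = *)
val sameReal = Real.== (implicitReal, explicitReal)
```

Working the integer case by hand: 3 * 4 is 12, 12 div 5 is 2, 2 mod 6 is 2, and ~1 + 2 - 2 is ~1, which matches the value ML reports on the slide.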
    - "bibity" ^ "bobity" ^ "boo";
    val it = "bibitybobityboo" : string
    - 2 < 3;
    val it = true : bool
    - 1.0 <= 1.0;
    val it = true : bool
    - #"d" > #"c";
    val it = true : bool
    - "abce" >= "abd";
    val it = false : bool
• String concatenation: ^ operator
• Ordering comparisons: <, >, <=, >=, apply to string, char, int and real
• Order on strings and characters is lexicographic

    - 1 = 2;
    val it = false : bool
    - true <> false;
    val it = true : bool
    - 1.3 = 1.3;
    Error: operator and operand don't agree [equality type required]
      operator domain: ''Z * ''Z
      operand:         real * real
      in expression:
        1.3 = 1.3
• Equality comparisons: = and <>
• Most types are equality testable: these are equality types
• Type real is not an equality type

    - 1 < 2 orelse 3 > 4;
    val it = true : bool
    - 1 < 2 andalso not (3 < 4);
    val it = false : bool
• Boolean operators: andalso, orelse, not. On booleans we can also use = for equivalence and <> for exclusive or.
• Precedence, from highest to lowest:
    ~  not
    *  /  div  mod
    +  -  ^
    =  <>  <  >  <=  >=
    andalso
    orelse

    - true orelse 1 div 0 = 0;
    val it = true : bool
• Note: andalso and orelse are short-circuiting operators. E.g., if the first operand of orelse is true, the second is not evaluated; likewise if the first operand of andalso is false.
• Technically, they are not ML operators but keywords, because all true ML operators evaluate all operands.

    - if 1 < 2 then #"x" else #"y";
    val it = #"x" : char
    - if 1 > 2 then 34 else 56;
    val it = 56 : int
    - (if 1 < 2 then 34 else 56) + 1;
    val it = 35 : int
• Conditional expression (not statement) using if … then … else …
• Similar to C's ternary operator: (1<2) ? 'x' : 'y'
• Value of the expression is the value of the then part if the test part is true, or the value of the else part otherwise
• There is no if … then construct

Practice
What is the value and ML type for each of the following expressions?
    1 * 2 + 3 * 4
    "abc" ^ "def"
    if (1 < 2) then 3.0 else 4.0
    1 < 2 orelse (1 div 0) = 0
What is wrong?
    10 / 5
    #"a" = #"b" or 1 = 2
    1.0 = 1.0
    if (1<2) then 3 else 4.0
    If (1<2) then 3

    - 1 * 2;
    val it = 2 : int
    - 1.0 * 2.0;
    val it = 2.0 : real
    - 1.0 * 2;
    Error: operator and operand don't agree [literal]
      operator domain: real * real
      operand:         real * int
      in expression:
        1.0 * 2
• The * operator, and others like + and <, are overloaded to have one meaning on pairs of integers, and another on pairs of reals
• ML does not perform implicit type conversion

    - real(123);
    val it = 123.0 : real
    - floor(3.6);
    val it = 3 : int
    - floor 3.6;
    val it = 3 : int
    - str #"a";
    val it = "a" : string
Some built-in conversion functions:
1. real (fn: int -> real)
2. floor (fn: real -> int)
3. ceil (fn: real -> int)
4. round (fn: real -> int)
5. trunc (fn: real -> int)
6. ord (fn: char -> int)
7. chr (fn: int -> char)
8. str (fn: char -> string)

Function Associativity
• Function application is left-associative
• So f a b means (f a) b, which means:
  – first apply f to the single argument a;
  – then take the value f returns, which should be another function;
  – then apply that function to b
• f g h i j means ((((f g) h) i) j)
• Note: This is different from function composition: f(g(h(i(j))))

    - square 2+1;
    val it = 5 : int
    - square (2+1);
    val it = 9 : int
• Function application has higher precedence than any operator: square 2+1; is read as (square 2) + 1;

Practice
What, if anything, is wrong with each of the following expressions?
    trunc 5
    ord "a"
    if 0 then 1 else 2
    if true then 1 else 2.0
    chr(trunc(97.0))
    chr(trunc 97.0)
    chr trunc 97.0
Reminder: trunc (fn: real -> int), ord (fn: char -> int), chr (fn: int -> char), str (fn: char -> string)

Defining Variables
    - val x = 1+2*3;
    val x = 7 : int
    - x;
    val it = 7 : int
    - val y = if x = 7 then 1.0 else 2.0;
    val y = 1.0 : real
• val defines a new variable and binds it to a value.
• Variable names should consist of a letter, followed by zero or more letters, digits, and/or underscores.
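Since ML performs no implicit conversion and function application binds tighter than any operator, val declarations that mix types need explicit conversion functions, and those calls rarely need extra parentheses. A short sketch using only the built-in functions listed earlier; the variable names are illustrative.

```sml
val count = 3
val half = real count / 2.0        (* read as (real count) / 2.0 *)
val letter = chr (trunc 65.2)      (* trunc drops the fraction, giving chr 65 *)
val code = ord letter + 1          (* read as (ord letter) + 1 *)
val label = str letter ^ "!"       (* read as (str letter) ^ "!" *)
```

The parentheses in chr (trunc 65.2) are the only required ones: without them, chr trunc 65.2 would be read as (chr trunc) 65.2, applying chr to the function trunc.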
    - val fred = 23;
    val fred = 23 : int
    - fred;
    val it = 23 : int
    - val fred = true;
    val fred = true : bool
    - fred;
    val it = true : bool

Typing
    - 1+2+3;
is the same as typing
    - val it = 1+2+3;

• You can define a new variable with the same name as an old one, even using a different type.
• This is not assignment. It defines a new variable but does not change the old one; it's a declaration.
• This example is not particularly useful.
• Any part of the program that was using the first definition of fred still is after the second definition is made.

Practice
Suppose we make these ML declarations in the following order:
    val a = "123";
    val b = "456";
    val c = a ^ b ^ "789";
    val a = 3 + 4;
Then, what are the values and types of a, b, and c after the last declaration above?
    a = 7 : int
    b = "456" : string
    c = "123456789" : string

Not an assignment
Suppose we make these ML declarations:
    - val a = 3;
    - fun f x = x + a;
    - val a = 4;
    - fun g x = x + a;
fun is the way we define a function (the flesh of ML).
Then, what are the following results?
    - g 2;    (answer: 6)
    - f 2;    (answer: 5)
f still uses the a that was in scope when f was defined, so the second val a does not affect it.

Garbage Collection
• Sometimes, for no apparent reason, we see GC #0.0.0.0.1.3: (0 ms)
• This means a garbage collection has been performed.

Tuples and Lists
The two most important structures in ML:
• Tuples: (1,2,3), like a struct
• Lists: [1,2,3], like an array

Tuples
1. A tuple is like a record (struct) with no field names
2. Use parentheses to form tuples
3. Tuples can contain other tuples
4. To get the ith element of a tuple x, use #i x

    - val barney = (1+2, 3.0*4.0, "brown");
    val barney = (3,12.0,"brown") : int * real * string
    - val point1 = ("red", (300,200));
    val point1 = ("red",(300,200)) : string * (int * int)
    - #2 barney;
    val it = 12.0 : real
    - #1 (#2 point1);
    val it = 300 : int

(1) is not a tuple of one

    - (1, 2);
    val it = (1,2) : int * int
    - (1);
    val it = 1 : int
    - #1 (1, 2);
    val it = 1 : int
    - #1 (1);
    Error: operator and operand don't agree [literal]
      operator domain: {1:'Y; 'Z}
      operand:         int
      in expression:
        (fn {1=1,...} => 1) 1

Tuple Type Constructor
• ML gives the type of a tuple using * as a type constructor
• For example, int * bool is the type of pairs (x,y) where x is an int and y is a bool
• Note that parentheses have structural significance. These are three different types:
    int * (int * bool)
    (int * int) * bool
    int * int * bool

Cartesian Products:
    A = {a, b}
    B = {1, 2, 3}
    A × B = {(a,1), (a,2), (a,3), (b,1), (b,2), (b,3)}

Lists
Use square brackets to make lists.
Unlike tuples, all elements of a list must be of the same type.

    - [1,2,3];
    val it = [1,2,3] : int list
    - [1.0,2.0];
    val it = [1.0,2.0] : real list
    - [true];
    val it = [true] : bool list
    - [(1,2),(1,3)];
    val it = [(1,2),(1,3)] : (int * int) list
    - [[1,2,3],[1,2]];
    val it = [[1,2,3],[1,2]] : int list list

    - [];
    val it = [] : 'a list
    - nil;
    val it = [] : 'a list

The empty list is [] or nil.
Note the unknown type of the empty list: 'a list.
Any variable name beginning with an apostrophe is a type variable; it stands for a type that is unknown.
'a list means a list of elements whose type is unknown.

The null test

    - null [];
    val it = true : bool
    - null [1,2,3];
    val it = false : bool

• null tests whether a given list is empty
• You could also use an equality test, as in x = []
• However, null x is preferred (we will see why in a moment)

List Type Constructor
• ML gives the type of lists using list as a type constructor
• For example, int list is the type of lists of ints
• A list is not a tuple
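To make the tuple and list type constructors concrete, here is a small sketch that writes the types out as explicit annotations. All of the names (point, colour, xCoord, nested, flat, pairs, empty, isEmpty, hasPairs) are invented for illustration and do not come from the slides.

```sml
(* A tuple may mix types; its type is written with *. *)
val point  = ("red", (300, 200));         (* string * (int * int) *)
val colour = #1 point;                    (* first component *)
val xCoord = #1 (#2 point);               (* first component of the inner pair *)

(* Parentheses in a tuple type matter: these two types are different. *)
val nested : int * (int * bool) = (1, (2, true));   (* a pair *)
val flat   : int * int * bool   = (1, 2, true);     (* a triple *)

(* Every element of a list has the same type; the type ends in list. *)
val pairs : (int * int) list = [(1, 2), (1, 3)];
val empty : int list = [];

(* null tests for the empty list without comparing elements. *)
val isEmpty  = null empty;
val hasPairs = not (null pairs);
```

Swapping the annotations on nested and flat is a quick way to see the type checker reject a pair where a triple is expected.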
    - [1,2,3]@[4,5,6];
    val it = [1,2,3,4,5,6] : int list

The @ operator concatenates lists.
Its operands are two lists of the same type.
Note: 1@[2,3,4] is wrong. Write [1]@[2,3,4] or, equivalently, 1::[2,3,4].

    - val x = #"c"::[];
    val x = [#"c"] : char list
    - val y = #"b"::x;
    val y = [#"b",#"c"] : char list
    - val z = #"a"::y;
    val z = [#"a",#"b",#"c"] : char list

The list-builder operator is :: (like cons in Lisp).
It puts the new element on the front of the old list; the types must agree.

    - val z = 1::2::3::[];
    val z = [1,2,3] : int list
    - hd z;
    val it = 1 : int
    - tl z;
    val it = [2,3] : int list
    - tl(tl z);
    val it = [3] : int list
    - tl(tl(tl z));
    val it = [] : int list

The :: operator is right-associative.
hd : the first element
tl : the tail; the whole list after the first element

Practice
What are the values of the following expressions?
• #2(3,4,5)
• hd(1::2::nil)
• hd(tl(#2([1,2],[3,4])))
What is wrong with the following expressions?
• 1@2
• hd(tl(tl [1,2]))
• [1]::[2,3]

explode : converts a string to a list of characters
implode : converts a list of characters to a string

    - explode "hello";
    val it = [#"h",#"e",#"l",#"l",#"o"] : char list
    - implode [#"h",#"i"];
    val it = "hi" : string
    - explode;
    val it = fn : string -> char list
    - implode;
    val it = fn : char list -> string
    - hd;
    val it = fn : 'a list -> 'a
    - tl;
    val it = fn : 'a list -> 'a list

Motto
• Type is the backbone of ML
• Recursion is the blood of ML
• Function is the flesh of ML
• Higher order is the soul of ML

• ML is constructed based on types (backbone)
• ML carries information by recursion (blood)
• ML's outlook is shaped by functions (flesh)
• ML exists because of its higher-order computation (soul)

Defining Functions
fun defines a new function and binds it to a variable.
Every function has a type, built with the type constructor ->; fun itself is a declaration keyword, like val.
Now, suppose we want to define a function that will take a string and return its first character.
    - fun firstChar s = hd (explode s);
    val firstChar = fn : string -> char
    - firstChar "abc";
    val it = #"a" : char

It is rarely necessary to declare any types, since ML infers them.
ML can tell that s must be a string, since we used explode on it, and it can tell that the function result must be a char, since it is the hd of a char list.

Function Definition Syntax in BNF
    <fun-def> ::= fun <function-name> <parameter> = <expression> ;
• <function-name> can be any legal ML name
• The simplest <parameter> is just a single variable name: the formal parameter of the function
• The <expression> is any ML expression; its value is the value the function returns
• This is a subset of ML function definition syntax

Function Type Constructor
• ML gives the type of functions using -> as a type constructor
• For example, int -> real is the type of a function that takes an int parameter (the domain type) and produces a real result (the range, or co-domain, type)
• All ML functions take exactly one parameter
• To pass more than one thing, you can pass a tuple

    - fun quot(a,b) = a div b;
    val quot = fn : int * int -> int
    - quot (6,2);
    val it = 3 : int
    - val pair = (6,2);
    val pair = (6,2) : int * int
    - quot pair;
    val it = 3 : int

Recursive factorial function

    - fun fact n =
    =   if n = 0 then 1
    =   else n * fact(n-1);
    val fact = fn : int -> int
    - fact 5;
    val it = 120 : int

Recursive function to add up the elements of an int list

    - fun listsum x =
    =   if null x then 0
    =   else hd x + listsum(tl x);
    val listsum = fn : int list -> int
    - listsum [1,2,3,4,5];
    val it = 15 : int

This function uses a common pattern:
1. base case for null x,
2. recursive call on tl x

Recursive function to compute the length of a list
(This is predefined in ML)

    - fun length x =
    =   if null x then 0
    =   else 1 + length (tl x);
    val length = fn : 'a list -> int
    - length [true,false,true];
    val it = 3 : int
    - length [4.0,3.0,2.0,1.0];
    val it = 4 : int

Note the type: this works on any type of list. It is polymorphic.

    - fun badlength x =
    =   if x=[] then 0
    =   else 1 + badlength (tl x);
    val badlength = fn : ''a list -> int
    - badlength [true,false,true];
    val it = 3 : int
    - badlength [4.0,3.0,2.0,1.0];
    Error: operator and operand don't agree [equality type required]

Type variables that begin with two apostrophes, like ''a, are restricted to equality types. badlength gets this restriction because we compared x for equality with the empty list, and real is not an equality type.
That's why null x is better than x=[]. It avoids unnecessary type restrictions.

Recursive function to reverse a list

    - fun reverse L =
    =   if null L then nil
    =   else reverse(tl L) @ [hd L];
    val reverse = fn : 'a list -> 'a list
    - reverse [1,2,3];
    val it = [3,2,1] : int list

ML Types and Type Annotations
• Primitive types: int, real, bool, char, and string
• Three type constructors:
  – Tuple types using *
  – List types using list
  – Function types using ->

Constructor's precedence
• When combining constructors, list >> * >> ->
    int * bool list          ≅  int * (bool list)
    int * bool list -> real  ≅  (int * (bool list)) -> real
• Use parentheses as necessary for clarity

    - fun prod(a,b) = a * b;
    val prod = fn : int * int -> int

Why int, rather than real?
ML's default type for * (and +, and -) is int * int -> int.
You can give an explicit type annotation to get real instead:

    - fun prod(a:real,b:real):real = a*b;
    val prod = fn : real * real -> real

A type annotation is a colon followed by a type.
It can appear after any variable or expression.
These are all equivalent:
    fun prod(a,b):real = a * b;
    fun prod(a:real,b) = a * b;
    fun prod(a,b:real) = a * b;
    fun prod(a,b) = (a:real) * b;
    fun prod(a,b) = a * b:real;
    fun prod(a,b) = (a*b):real;
    fun prod((a,b):real * real) = a*b;
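The recursion pattern (base case on null x, recursive call on tl x) and the effect of a type annotation can be seen side by side in a short sketch. The functions listprod, intprod, and count below are hypothetical examples written for this note; they are not part of the lecture or of the ML library.

```sml
(* With an annotation, the overloaded * is resolved to real. *)
fun listprod (x : real list) : real =
  if null x then 1.0                    (* base case: empty list *)
  else hd x * listprod (tl x);          (* recursive call on the tail *)

(* Without an annotation, * falls back to its default type, int. *)
fun intprod x =
  if null x then 1
  else hd x * intprod (tl x);

(* count never inspects the elements and tests emptiness with null,
   so it is inferred as 'a list -> int and accepts a real list too. *)
fun count x =
  if null x then 0
  else 1 + count (tl x);

val p = listprod [1.5, 2.0, 4.0];
val q = intprod [2, 3, 4];
val n = count [4.0, 3.0, 2.0, 1.0];
```

Replacing null x with x = [] in count should change its inferred type to ''a list -> int, after which the call on a real list would no longer type-check.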