CS5205: Foundation in Programming Languages
Lecture 1 : Overview
“Language Foundation, Extensions and Reasoning”
Lecturer : Chin Wei Ngan
Email : chinwn@comp.nus.edu.sg
Office : S15 06-01
CS5205 Introduction

Course Objectives
- graduate-level course with foundation focus
- languages as tool for programming
- foundations for reasoning about programs
- explore various language innovations

Course Outline
• Lecture Topics (13 weeks)
  • Lambda Calculus
  • Advanced Language (Haskell)
    http://www.haskell.org
  • Type System for Lightweight Analysis
    http://www.cs.cmu.edu/~rwh/plbook/book.pdf
  • Semantics
  • Formal Reasoning – Separation Logic + Provers
  • Language Innovations (paper readings)

Administrative Matters
- mainly IVLE
- Reading Materials (mostly online):
    www.haskell.org
    Robert Harper : Practical Foundations for Programming Languages.
    Free PL books : http://www.cs.uu.nl/~franka/ref
- Lectures + Assignments + Presentation + Exam
- Assignment/Quiz (30%)
- Paper Reading + Critique (25%)
- Exam (45%)

Paper Critique
• Focus on Language Innovations
• Select Topic (Week 3)
• 2-Page Summary (Week 5)
• Prepare Presentation (Week 7)
• Oral Presentation (Last 4 weeks)
• Paper Critique (Week 13)
• Possible Topics
  • Concurrent and MultiCore Programming
  • Software Transactional Memory
  • GUI Programming with Arrows
  • Testing with QuickCheck
  • IDE for Haskell
  • SELinks (OCaml)

Advanced Language - Haskell
• Strongly-typed with polymorphism
• Higher-order functions
• Pure Lazy Language
• Algebraic data types + records
• Exceptions
• Type classes, Monads, Arrows, etc
• Advantages : concise, abstract, reuse
• Why use Haskell ? cool & greater productivity

Example - Haskell Program
• Apply a function to every element of a list.
    data List a = Nil | Cons a (List a)        -- a is a type variable

    -- type is : (a -> b) -> (List a) -> (List b)
    map f Nil         = Nil
    map f (Cons x xs) = Cons (f x) (map f xs)

    map inc (Cons 1 (Cons 2 (Cons 3 Nil)))
      ==> (Cons 2 (Cons 3 (Cons 4 Nil)))

Some Applications
• Hoogle – search in code library
• Darcs – distributed version control
• Programming user interface with arrows
• How to program multicore?
• map/reduce and Cloud computing
• Some issues to be studied in Paper Reading.

Type System – Lightweight Analysis
• Abstract description of code + genericity
• Compile-time analysis that is tractable
• Guarantees absence of some bad behaviors
• Issues – expressivity, soundness, completeness, inference?
• How to use, design and prove type system.
• Why? detect bugs

Specification with Separation Logic
• Is sorting algorithm correct?
• Any memory leaks?
• Any null pointer dereference?
• Any array bound violation?
• What is your specification/contract?
• How to verify program correctness?
• Issues – mutation and aliasing
• Why? software reliability

Lambda Calculus
• Untyped Lambda Calculus
• Evaluation Strategy
• Techniques - encoding, extensions, recursion
• Operational Semantics
• Explicit Typing
Introduction to Lambda Calculus:
    http://www.inf.fu-berlin.de/lehre/WS03/alpi/lambda.pdf
    http://www.cs.chalmers.se/Cs/Research/Logic/TypesSS05/Extra/geuvers.pdf
Lambda Calculator :
    http://ozark.hendrix.edu/~burch/proj/lambda/download.html

Untyped Lambda Calculus
• Extremely simple programming language which captures core aspects of computation and yet allows programs to be treated as mathematical objects.
• Focused on functions and applications.
• Invented by Alonzo Church (1936, 1941), used in programming (Lisp) by John McCarthy (1959).

Functions without Names
Usually functions are given a name (e.g.
in language C):

    int plusOne(int x) { return x+1; }
    … plusOne(5) …

However, function names can also be dropped:

    (int (int x) { return x+1; }) (5)

Notation used in untyped lambda calculus:

    (λx. x+1) (5)

Syntax
In purest form (no constraints, no built-in operations), the lambda calculus has the following syntax.

    t ::=            terms
          x          variable
          λx.t       abstraction
          t t        application

This is the simplest universal programming language!

Conventions
• Parentheses are used to avoid ambiguities.
  e.g. x y z can be either (x y) z or x (y z)
• Conventions for avoiding too many parentheses:
  • Application associates to the left.
    e.g. x y z stands for (x y) z
  • Bodies of lambdas extend as far as possible.
    e.g. λx. λy. x y x stands for λx. (λy. ((x y) x))
  • Nested lambdas may be collapsed together.
    e.g. λx. λy. x y x can be written as λx y. x y x

Scope
• An occurrence of variable x is said to be bound when it occurs in the body t of an abstraction λx.t
• An occurrence of x is free if it appears in a position where it is not bound by an enclosing abstraction of x.
• Examples:
      x y
      λy. x y
      λx. x                    (identity function)
      (λx. x x) (λx. x x)      (non-stop loop)
      (λx. x) y
      (λx. x) x

Alpha Renaming
• Lambda expressions are equivalent up to bound variable renaming.
  e.g. λx. x   =α  λy. y
       λy. x y =α  λz. x z
  But NOT:
       λy. x y =α  λy. z y
• Alpha renaming rule:
       λx.E =α λz.[x ↦ z] E      (z is not free in E)

Beta Reduction
• An application whose LHS is an abstraction evaluates to the body of the abstraction with parameter substitution.
  e.g. (λx. x y) z           →β  z y
       (λx. y) z             →β  y
       (λx. x x) (λx. x x)   →β  (λx. x x) (λx. x x)
• Beta reduction rule (operational semantics):
       (λx.t1) t2 →β [x ↦ t2] t1
  An expression of the form (λx.t1) t2 is called a redex (reducible expression).

Evaluation Strategies
• A term may have many redexes.
  Evaluation strategies can be used to limit the number of ways in which a term can be reduced.
• An evaluation strategy is deterministic if it allows reduction with at most one redex, for any term.
• Examples:
  - full beta reduction
  - normal order
  - call by name
  - call by value, etc

Full Beta Reduction
• Any redex can be chosen, and evaluation proceeds until no more redexes are found.
• Example: (λx.x) ((λx.x) (λz. (λx.x) z)), denoted by id (id (λz. id z))
  Three possible redexes to choose:
      id (id (λz. id z))      the whole term
         id (λz. id z)        the argument of the outer id
                 id z         the body of λz
• Reduction:
      id (id (λz. id z))
      → id (id (λz.z))
      → id (λz.z)
      → λz.z
      ↛

Normal Order Reduction
• Deterministic strategy which chooses the leftmost, outermost redex, until no more redexes.
• Example Reduction:
      id (id (λz. id z))
      → id (λz. id z)
      → λz. id z
      → λz.z
      ↛

Call by Name Reduction
• Chooses the leftmost, outermost redex, but never reduces inside abstractions.
• Example:
      id (id (λz. id z))
      → id (λz. id z)
      → λz. id z
      ↛

Call by Value Reduction
• Chooses the leftmost, innermost redex whose RHS is a value; and never reduces inside abstractions.
• Example:
      id (id (λz. id z))
      → id (λz. id z)
      → λz. id z
      ↛

Strict vs Non-Strict Languages
• Strict languages always evaluate all arguments to a function before entering the call. They employ call-by-value evaluation (e.g. C, Java, ML).
• Non-strict languages will enter a function call and only evaluate the arguments as they are required. Call-by-name (e.g. Algol-60) and call-by-need (e.g. Haskell) are possible evaluation strategies, with the latter avoiding the reevaluation of arguments.
• In the case of call-by-name, the evaluation of an argument occurs with each parameter access.

Programming Techniques in λ-Calculus
• Multiple arguments.
• Church Booleans.
• Pairs.
• Church Numerals.
• Recursion.
• Extended Calculus

Multiple Arguments
• Pass multiple arguments one by one using lambda abstraction as intermediate results. The process is also known as currying.
• Example:
      f = λ(x,y).s      becomes      f = λx. (λy. s)
  Application:
      f(v,w)            becomes      (f v) w
  The tupled form requires pairs as primitive types; the curried form requires higher order feature.

Church Booleans
• Church’s encodings for true/false type with a conditional:
      true  = λt. λf. t
      false = λt. λf. f
      if    = λl. λm. λn. l m n
• Example:
      if true v w
        = (λl. λm. λn. l m n) true v w
        → true v w
        = (λt. λf. t) v w
        → v
• Boolean and operation can be defined as:
      and = λa. λb. if a b false
          = λa. λb. (λl. λm. λn. l m n) a b false
          = λa. λb. a b false

Pairs
• Define the functions pair to construct a pair of values, fst to get the first component and snd to get the second component of a given pair as follows:
      pair = λf. λs. λb. b f s
      fst  = λp. p true
      snd  = λp. p false
• Example:
      snd (pair c d)
        = (λp. p false) ((λf. λs. λb. b f s) c d)
        → (λp. p false) (λb. b c d)
        → (λb. b c d) false
        → false c d
        → d

Church Numerals
• Numbers can be encoded by:
      c0 = λs. λz. z
      c1 = λs. λz. s z
      c2 = λs. λz. s (s z)
      c3 = λs. λz. s (s (s z))
      :
• Successor function can be defined as:
      succ = λn. λs. λz. s (n s z)
  Example:
      succ c1
        = (λn. λs. λz. s (n s z)) (λs. λz. s z)
        → λs. λz. s ((λs. λz. s z) s z)
        → λs. λz. s (s z)
      succ c2
        = (λn. λs. λz. s (n s z)) (λs. λz. s (s z))
        → λs. λz. s ((λs. λz. s (s z)) s z)
        → λs. λz. s (s (s z))
• Other Arithmetic Operations:
      plus   = λm. λn. λs. λz. m s (n s z)
      times  = λm. λn. m (plus n) c0
      iszero = λm. m (λx. false) true
• Exercise : Try out the following.
      plus c1 x
      times c0 x
      times x c1
      iszero c0
      iszero c2

Recursion
• Some terms go into a loop and do not have normal form. Example:
      (λx. x x) (λx. x x)
      → (λx. x x) (λx. x x)
      → …
• However, others have an interesting property:
      fix = λf. (λx. f (x x)) (λx. f (x x))                  (suits call-by-name)
      fix = λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))      (suits call-by-value)
  that returns a fix-point for a given functional.
      Given   x = h x
                = fix h          (x is a fix-point of h)
      That is:  fix h → h (fix h) → h (h (fix h)) → …

Example - Factorial
• We can define factorial as:
      fact = λn. if (n<=1) then 1 else times n (fact (pred n))
           = (λh. λn. if (n<=1) then 1 else times n (h (pred n))) fact
      fact = fix (λh. λn. if (n<=1) then 1 else times n (h (pred n)))
• Recall:
      fact = fix (λh. λn. if (n<=1) then 1 else times n (h (pred n)))
  Let g = (λh. λn. if (n<=1) then 1 else times n (h (pred n)))
• Example reduction:
      fact 3 = fix g 3
             = g (fix g) 3
             = times 3 ((fix g) (pred 3))
             = times 3 (g (fix g) 2)
             = times 3 (times 2 ((fix g) (pred 2)))
             = times 3 (times 2 (g (fix g) 1))
             = times 3 (times 2 1)
             = 6

Enriching the Calculus
• We can add constants and built-in primitives to enrich λ-calculus. For example, we can add boolean and arithmetic constants and primitives (e.g. true, false, if, zero, succ, iszero, pred) into an enriched language we call λNB:
• Example:
      λx. succ (succ x) ∈ λNB
      λx. true ∈ λNB

Formal Treatment of Lambda Calculus
• Let V be a countable set of variable names. The set of terms is the smallest set T such that:
  1. x ∈ T for every x ∈ V
  2. if t1 ∈ T and x ∈ V, then λx.t1 ∈ T
  3. if t1 ∈ T and t2 ∈ T, then t1 t2 ∈ T
• Recall syntax of lambda calculus:
      t ::=            terms
            x          variable
            λx.t       abstraction
            t t        application

Free Variables
• The set of free variables of a term t is defined as:
      FV(x)     = {x}
      FV(λx.t)  = FV(t) \ {x}
      FV(t1 t2) = FV(t1) ∪ FV(t2)

Substitution
• Works when free variables are replaced by a term that does not clash:
      [x ↦ λz. z w] (λy.x) = (λy. λz. z w)
• However, there is a problem if there is name capture/clash:
      [x ↦ λz. z w] (λx.x) ≠ (λx. λz. z w)
      [x ↦ λz. z w] (λw.x) ≠ (λw. λz. z w)

Formal Defn of Substitution
      [x ↦ s] x       = s
      [x ↦ s] y       = y                            if y ≠ x
      [x ↦ s] (t1 t2) = ([x ↦ s] t1) ([x ↦ s] t2)
      [x ↦ s] (λy.t)  = λy.t                         if y = x
      [x ↦ s] (λy.t)  = λy.[x ↦ s] t                 if y ≠ x ∧ y ∉ FV(s)
      [x ↦ s] (λy.t)  = [x ↦ s] (λz.[y ↦ z] t)       if y ≠ x ∧ y ∈ FV(s) ∧ fresh z

Syntax of Lambda Calculus
• Term:
      t ::=            terms
            x          variable
            λx.t       abstraction
            t t        application
• Value:
      v ::=            value
            x          variable
            λx.t       abstraction value

Call-by-Value Semantics
(each rule is written as premise over conclusion)

          t1 → t1'
      ─────────────────          (E-App1)
      t1 t2 → t1' t2

          t2 → t2'
      ─────────────────          (E-App2)
      v1 t2 → v1 t2'

      (λx.t) v → [x ↦ v] t       (E-AppAbs)

Call-by-Name Semantics

          t1 → t1'
      ─────────────────          (E-App1)
      t1 t2 → t1' t2

      (λx.t) t2 → [x ↦ t2] t     (E-AppAbs)

Getting Stuck
• Evaluation can get stuck. (Note that the only values are λ-abstractions.)
  e.g. (x y)
• In extended lambda calculus, evaluation can also get stuck due to the absence of certain primitive rules.
      (λx. succ x) true → succ true ↛

Boolean-Enriched Lambda Calculus
• Term:
      t ::=                       terms
            x                     variable
            λx.t                  abstraction
            t t                   application
            true                  constant true
            false                 constant false
            if t then t else t    conditional
• Value:
      v ::=                       value
            λx.t                  abstraction value
            true                  true value
            false                 false value

Key Ideas
• Exact typing impossible.
      if <long and tricky expr> then true else (λx.x)
• Need to introduce function type, but need argument and result types.
      if true then (λx.true) else (λx.x)

Simple Types
• The set of simple types over the type Bool is generated by the following grammar:
      T ::=            types
            Bool       type of booleans
            T → T      type of functions
• → is right-associative: T1 → T2 → T3 denotes T1 → (T2 → T3)

Implicit or Explicit Typing
• Languages in which the programmer declares all types are called explicitly typed. Languages where a typechecker infers (almost) all types are called implicitly typed.
• Explicitly-typed languages place the onus on the programmer but are usually better documented. Also, compile-time analysis is simplified.

Explicitly Typed Lambda Calculus
• t ::=                terms
        …
        λx:T.t         abstraction
        …
• v ::=                value
        λx:T.t         abstraction value
        …
• T ::=                types
        Bool           type of booleans
        T → T          type of functions

Examples
      true
      λx:Bool. x
      (λx:Bool. x) true
      if false then (λx:Bool. true) else (λx:Bool. x)
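
Sketch - Untyped Lambda Calculus in Haskell
The definitions from the untyped part of this lecture (term syntax, free variables, capture-avoiding substitution, the call-by-value rules and normal order reduction) can be tried out with a small interpreter. The following is only a minimal sketch under stated assumptions, not the course's reference implementation: the names Term, freeVars, fresh, subst, stepCBV, stepNormal, reduceWith, church and succC are all hypothetical, fresh names are generated by appending a number, and only abstractions are treated as values (following the note on the Getting Stuck slide).

```haskell
import Data.List (union)

-- t ::= x | λx.t | t t
data Term
  = Var String
  | Lam String Term
  | App Term Term
  deriving (Eq, Show)

-- FV(x) = {x}, FV(λx.t) = FV(t) \ {x}, FV(t1 t2) = FV(t1) ∪ FV(t2)
freeVars :: Term -> [String]
freeVars (Var x)     = [x]
freeVars (Lam x t)   = filter (/= x) (freeVars t)
freeVars (App t1 t2) = freeVars t1 `union` freeVars t2

-- Hypothetical helper: first of y1, y2, ... that is not in the avoid list.
fresh :: String -> [String] -> String
fresh y avoid = head [z | n <- [1 :: Int ..], let z = y ++ show n, z `notElem` avoid]

-- subst x s t computes [x ↦ s] t, renaming a bound variable when it would
-- capture a free variable of s.
subst :: String -> Term -> Term -> Term
subst x s (Var y)
  | y == x    = s
  | otherwise = Var y
subst x s (App t1 t2) = App (subst x s t1) (subst x s t2)
subst x s (Lam y t)
  | y == x                 = Lam y t
  | y `notElem` freeVars s = Lam y (subst x s t)
  | otherwise              = subst x s (Lam z (subst y (Var z) t))
  where z = fresh y (x : freeVars s ++ freeVars t)

-- Assumption: only abstractions count as values.
isValue :: Term -> Bool
isValue (Lam _ _) = True
isValue _         = False

-- One call-by-value step; Nothing means a value or a stuck term.
stepCBV :: Term -> Maybe Term
stepCBV (App (Lam x t) v) | isValue v = Just (subst x v t)   -- E-AppAbs
stepCBV (App t1 t2)
  | not (isValue t1) = (\t1' -> App t1' t2) <$> stepCBV t1   -- E-App1
  | otherwise        = App t1 <$> stepCBV t2                 -- E-App2
stepCBV _ = Nothing

-- One normal order step: leftmost, outermost redex, also under lambdas.
stepNormal :: Term -> Maybe Term
stepNormal (App (Lam x t) t2) = Just (subst x t2 t)
stepNormal (App t1 t2) =
  case stepNormal t1 of
    Just t1' -> Just (App t1' t2)
    Nothing  -> App t1 <$> stepNormal t2
stepNormal (Lam x t) = Lam x <$> stepNormal t
stepNormal (Var _)   = Nothing

-- Apply a step function until it no longer applies (may loop forever).
reduceWith :: (Term -> Maybe Term) -> Term -> Term
reduceWith step t = maybe t (reduceWith step) (step t)

-- id and the running example id (id (λz. id z))
idT, example :: Term
idT     = Lam "x" (Var "x")
example = App idT (App idT (Lam "z" (App idT (Var "z"))))

-- Church numeral cn = λs. λz. s (s (... z)) and succ = λn. λs. λz. s (n s z)
church :: Int -> Term
church n = Lam "s" (Lam "z" (iterate (App (Var "s")) (Var "z") !! n))

succC :: Term
succC = Lam "n" (Lam "s" (Lam "z"
          (App (Var "s") (App (App (Var "n") (Var "s")) (Var "z")))))
```

Loading this in GHCi, reduceWith stepCBV example stops at λz. id z while reduceWith stepNormal example continues under the lambda to λz. z, matching the two reduction sequences in the slides. Keeping the step functions separate from reduceWith makes it easy to compare strategies on the same term; a term such as (λx. x x) (λx. x x) will make reduceWith run forever.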