Programming Paradigms
Concepts

Beginning to program
- Beginning programmers have to learn about loops, if statements, and input/output, and maybe some form of function or procedure
- Programming concepts are easy; it’s writing programs that’s hard!
- These concepts are easy
  - Tell a beginner that a loop does the same thing over and over, and they have no trouble understanding that
  - The same goes for if statements and functions
- So why do beginners have so much trouble writing programs?
  - Understanding the concepts individually is no big deal
  - Learning to program is all about making these concepts an automatic (invisible) part of your mental toolkit
- In this course I’ve introduced more programming concepts
  - Again, the concepts are easy, and (once understood) are just common sense
  - Hopefully, some of these concepts become invisible to you!

What we did not cover
- The ecosystem surrounding a programming language is very important
  - Is there a community of programmers for that language who can help when you have trouble?
  - Is there a decent IDE for the language?
    - Or at the very least, an editor that does syntax coloring?
  - Are there good debugging tools?
  - Are there good build tools for larger programs?
    - make, sbt, ant, Leiningen, etc.
  - Is there a good web application framework?
    - Ruby/Rails, Python/Django, Scala/Lift, Java/Scala/Play! 2

The Blub paradox
- Looking at more powerful languages: “These languages are full of unnecessary weird stuff”
- “My language, Blub, has everything I ever want or need”
- Looking at less powerful languages: “These languages are defective, because they don’t have feature x, which I use all the time”

A declarative language
- Most programming languages are imperative: you tell the computer what to do to solve a problem
- Prolog is declarative: you give it the data it needs, and it solves the problem for you
- Consider:
    append([], List, List).
    append([Head | Tail], List, [Head | More]) :- append(Tail, List, More).
- This defines what it means to append lists
  - You can use it to append two lists
  - You can use it to find what to append to a list to get another list
  - You can use it to find what list to append to in order to get another list
  - You can test whether the relation holds between three lists

A constraint satisfaction language
- Prolog is also one of a number of constraint satisfaction languages
- You tell it the constraints on a problem (e.g. the solution must be someone who is both female and rich), and it finds a solution that satisfies those constraints
- A “logic puzzle” is nothing more than a set of constraints (e.g. every man has a different tie)

A homoiconic language
- Prolog is homoiconic; that is, there is no distinction between “statements” in the language and the “data” that the language processes
- When you provide “input” to a Prolog program (e.g. for an adventure game), you use the same syntax as when you write the program
- We haven’t emphasized homoiconicity in Prolog, because it is more important in some of the other languages we will be studying

Prolog does backtracking
- You don’t have to implement backtracking in Prolog; the language does it for you
- The only other “language” I know of that does this is Regular Expressions, which are a “sublanguage” of most modern programming languages
- [Figure: box diagram of the goal loves(chuck, X), with ports labeled call, exit, redo, and fail, containing the subgoals female(X) and rich(X)]

Prolog uses unification
- The basic rules of unification are:
  - Any value can be unified with itself
  - A variable can be unified with another variable
  - A variable can be unified with any value
  - Two different structures can be unified if their constituents can be unified
    - Lists are a particularly important kind of structure
  - A variable can be unified with a structure containing that same variable
- Unification is reflexive, symmetric, and transitive
- Parameter transmission is by unification, not assignment

Prolog is a theorem prover
- Prolog is an implementation of resolution theorem proving in a programming language
- Because it is based on logic, there is
very little that is ad hoc or arbitrary about Prolog

Resolution
- A clause is a disjunction (“or”) of zero or more literals, some or all of which may be negated
  - sinks(X) ∨ dissolves(X, water) ∨ ¬denser(X, water)
- Any expression in the predicate calculus can be put into clause form
  - X ⇒ Y can be rewritten as ¬X ∨ Y
  - Conjunctions (“ands”) can be rewritten as separate clauses
  - The existential quantifier ∃ can be replaced by a Skolem function
    - If the existential quantifier is under control of a universal quantifier, replace it with a function of the universally quantified variable
    - Otherwise, just replace with a constant (whose value need not be known)
  - Universal quantifiers, ∀, can be dropped
- Here is the resolution principle:
    X ∨ someLiterals
    ¬X ∨ someOtherLiterals
    ---------------------------------------------
    conclude: someLiterals ∨ someOtherLiterals
- Clauses are closed under resolution

Abstract syntax trees
- Compilers turn source code into ASTs (Abstract Syntax Trees)
  - This is a major first step in virtually all compilers and interpreters
- Lisp/Clojure syntax is an AST
  - Parenthesized expressions directly represent tree structures

Basic functional programming
- Functions are values (“first-class objects”)
  - May be passed as parameters to other functions
  - May be returned as the result of a function call
  - May be stored in variables or data structures
- Functions have no side effects
  - Given the same parameters, a function returns the same result
  - Referential transparency: The result of a function call may be substituted for the function call
- Data is immutable and persistent
  - Immutable data makes concurrency much easier
  - Persistent means that structure is shared

Four rules for doing recursion
- Do the base cases first
- Recur only with simpler cases
- Don't modify and use non-local variables
  - You can modify them or use them, just not both
  - Remember, parameters count as local variables, but if a parameter is a reference to an object, only the reference is local, not the referenced object
- Don't look down
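The four rules above can be sketched in Scala. This is only an illustration, not code from the course: the object name `RecursionRules`, the `Tree` type, and the functions `sum` and `size` are hypothetical names chosen for the example.

```scala
object RecursionRules {
  // Hypothetical binary tree type, introduced only for this sketch.
  sealed trait Tree
  case object Leaf extends Tree
  final case class Node(left: Tree, value: Int, right: Tree) extends Tree

  // Sum a list of integers.
  def sum(xs: List[Int]): Int = xs match {
    case Nil          => 0                // Rule 1: handle the base case first
    case head :: tail => head + sum(tail) // Rule 2: recur only on a simpler case (the tail)
  }

  // Count the nodes of a tree.
  // Rule 3: no non-local variable is both modified and used; only parameters appear.
  // Rule 4 ("don't look down"): assume size works for the subtrees and combine the results.
  def size(t: Tree): Int = t match {
    case Leaf                 => 0
    case Node(left, _, right) => size(left) + 1 + size(right)
  }
}
```

For instance, `RecursionRules.sum(List(1, 2, 3, 4))` evaluates to 10. Neither function is tail recursive, since each still has work to do after the recursive call returns.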

Recursion
- Loops and recursion are equivalent in power
  - Anything that can be done with a loop can be done recursively, and vice versa
- If a function is tail-recursive, it can be automatically turned into a loop (thus saving stack frames)
  - Tail recursion is when the recursion is the last thing done in every branch of the function
- Loops are less useful when data is immutable
  - Loops are used primarily to modify data
- When data is immutable, the substitution rule applies: Any variable or function call may be replaced by its value
  - The substitution rule simplifies reasoning about programs

Tail call elimination
- Recursive functions can often be rewritten to be tail recursive
  - This is done by creating a helper function that takes an additional “accumulator” parameter
- Non-tail recursive function to find the factorial:
    (defn factorial-1 [n]
      (if (<= n 1)          ; <= rather than = so that n = 0 also terminates
        1
        (* n (factorial-1 (dec n)))))
- Tail recursive function to find the factorial:
    (defn factorial-2
      ([n] (factorial-2 1 n))
      ([acc n]
       (if (<= n 1)
         acc
         (recur (* n acc) (dec n)))))

Higher-order functions add expressiveness
- A higher-order function is a function that takes a function as an argument, returns a function as its value, or both
- Of course, higher-order functions do not increase the number of things that can be computed
  - Any Turing complete language can compute anything that can be computed
  - It doesn’t take much for a language to be Turing complete
- Higher-order functions can take the place of explicit recursions (or explicit loops)
  - Using higher-order functions typically makes code shorter and clearer

Common higher-order functions
- Almost every functional language has these features and functions:
  - Anonymous (literal) functions, so a function doesn’t have to be defined independently, but can be placed directly as an argument to a higher-order function
  - (map function sequence) applies the function to each element of the sequence, yielding a sequence of the results
  - (filter predicate sequence) applies the predicate to each element of
the sequence, yielding a sequence of the elements that satisfy the predicate
  - (reduce binary-function initial-value sequence) applies the binary-function to the running result and each element of the sequence in turn, starting with the initial-value, and yielding a single value
  - List comprehensions combine and may simplify the above
      user=> (for [x (take 10 (iterate inc 1))] (* x x))
      (1 4 9 16 25 36 49 64 81 100)

Closures
- When a function definition uses variables that are not parameters or local variables, those variables are “closed over” and retained by the function
- The function uses those variables, not the value the variables had when the function was created
    user=> (defn rangechecker [min max]
             (fn [num] (and (>= num min) (<= num max))))
    #'user/rangechecker
- Notice that the function being returned gets the values of min and max from the environment, not as parameters
    user=> (def in-range? (rangechecker 0 100))
    #'user/in-range?

Currying and function composition
- Currying is absorbing a parameter into a function to make a new function
    user=> (def hundred-times (partial * 100))
    #'user/hundred-times
- Function composition is combining two or more functions into a new function
    user=> (def third (comp first rest rest))
    #'user/third

Persistence and laziness
- A persistent data structure is one that is itself immutable, but can be modified to create a “new” data structure
  - The original and the new data structure share structure to minimize copying time and wasted storage
- A lazy data structure is one where parts of it do not exist until they are accessed
  - They are implemented (by Clojure) as functions that return elements of a sequence only when requested
  - This allows you to have “infinite” data structures

Macros, metaprogramming, homoiconicity
- A macro is like a function whose result is code
- Metaprogramming is using programs to write programs
  - Arguments to a macro are not evaluated
  - Macro calls are evaluated at compile time
  - The return value of a macro should be executable code
  - This is the primary use of
macros in Clojure
- Macros are used to implement DSLs (Domain Specific Languages)
- Homoiconicity is when there is no distinction between code and data
  - Lisp code is lists; Lisp data is lists
  - Homoiconicity greatly simplifies metaprogramming
  - Prolog and REBOL are also homoiconic languages

Software Transactional Memory
- A transaction takes a private copy of any reference it needs
  - Since data structures are persistent, this is not expensive
  - The transaction works with this private copy
- If the transaction completes its work, and the original reference has not been changed (by some other transaction), then the new values are copied back atomically to the original reference
- But if, during the transaction, the original data is changed, the transaction will automatically be retried with the changed data
- However, if the transaction throws an exception, it will abort without a retry

Functional Programming (FP)
- Haskell is a purely functional language
- Almost all modern languages have some functional aspects
- In FP,
  - Functions are first-class objects. That is, they are values, just like other objects are values, and can be treated as such
    - Functions can be assigned to variables
    - Functions can be passed as parameters to higher-order functions
    - Functions can be returned as results of functions
    - Functions can be manipulated to create new functions
  - There are function literals
- One author suggests that the most important characteristic of a functional language is that it be single assignment
  - Everything else (?)
follows from this

FP rules and benefits
- To get the benefits of functional programming, functions should be “real” functions, in the mathematical sense
  - Functions should be free of side effects (input/output, mutable state)
  - Functions should be deterministic
    - The arguments to a function, and nothing else, determine its result
  - If a function is side-effect free and deterministic, it has referential transparency: all calls to the function could be replaced in the program text by the result of the function
- Benefits:
  - Because functions don’t make use of external, mutable data, they are easier to reason about, both mathematically and informally
  - Because immutable data is a natural consequence of functional programming, correct concurrent programming becomes feasible
  - Functional programming adds some powerful and convenient tools to the programmer’s toolbox (such as map, filter, and reduce)

Currying
- Currying absorbs an argument into a function, producing a function with one fewer argument
- All Haskell functions are curried, and take one argument (or none at all)
- A partially applied function is one that has absorbed one or more of its arguments, and requires more arguments before it produces a non-function result
- A partial function is one that is not defined for all possible inputs

Type systems
- Static type checking is ensuring at compile time that all variables have known, fixed types, and that all operations used are appropriate for that type
- In dynamic type checking, variables may hold any type of value, and the types are checked only when an operation is applied
- In duck typing, the type of a variable is inconsequential, so long as the requested operations can be performed
  - Programs don’t check the type, they check whether the desired operations are (currently) available
- Haskell uses static types, but uses Hindley-Milner type inference to determine the types for itself

Static and dynamic languages
- In a static language, all the types and all the methods are determined
before the program runs
  - Some examples are C, C++, Java, Scala, Fortran, …
  - Generally regarded as better for programming in the large
  - Faster, because compiler optimization techniques are better developed (this is changing)
    - A lot of work is currently being done to optimize JavaScript
- In a dynamic language, types can change, and methods can be created (and used) or destroyed at runtime
  - Dynamic languages typically have eval (or equivalent)
  - Some examples are Perl, Lisp(s), Prolog, REBOL, Ruby, JavaScript, Python, …
  - Generally regarded as better for programming in the small
  - Often faster in practice, because less code

Scripting languages
- A scripting language is a language that automates tasks that would otherwise be done manually at the command line
  - The typical use is to “script” together other programs, including those provided by the operating system (cd, chmod, etc.)
- Scripting languages are typically dynamic and interpreted
- Perl is the most popular, but other languages can be so used (AppleScript, Bash, PowerShell, Python, Ruby, even Scala)

JVM Languages
- Designed for the JVM
  - Java 8 (for the patient and hopeful developer)
  - Scala
  - Clojure
  - Kotlin
  - Ceylon
  - Groovy
  - Fantom
  - Mirah
- Existing languages ported to the JVM
  - Ruby (JRuby)
  - Python (Jython)
  - JavaScript (Rhino and others)
  - Erlang (Erjang)
  - Scheme (different implementations like Kawa)
- Source: http://www.infoq.com/research/next-jvm-language

Typeclasses
- A Haskell typeclass is like a Java interface, or a Scala trait: it tells what functions an object can support
- Some typeclasses and what they support:
    Eq -- == and /=
    Ord -- < <= >= >
    Num -- + - * and others (division, /, belongs to Fractional)
    Show -- show (enables printing as a string)
    Read -- read (conversion from a string to something else)
    Functor -- fmap (enables mapping over things)
      Lists belong to the Functor typeclass
    Monad -- >>= >> return fail
- The importance of a typeclass is that it can be mixed in to other classes, giving them features that would otherwise have to be programmed in each case

Pattern Types
- Just as Prolog does parameter passing by unification, Haskell does parameter passing by pattern matching
  - There isn’t a lot of difference between the two techniques
- A variable will match anything
- A wildcard, _, will match anything, but you can’t use the matched value
- A constant will match only that value
- Tuples will match tuples, if same length and constituents match
- Lists will match lists, if same length and constituents match
  - However, the pattern may specify a list of arbitrary length
  - (h:t) will match a nonempty list whose head is h and whose tail is t
- “As-patterns” have the form w@pattern
  - When the pattern matches, the w matches the whole of the thing matched
- (n+k) matches any value equal to or greater than k; n is k less than the value matched

Monads
- A monad consists of three things:
  - A type constructor M
    - This basically defines the form that M can take
  - A bind operation, >>= (Haskell) or flatMap (Scala)
    - For a monad m, this is a function m a -> (a -> m b) -> m b
    - This function takes an a out of the monad, applies a function a -> m b to it, and returns the new monad m b
  - A function that puts a value into a monad: a -> m a
- Scala example:
    scala> val v = Some(2.0)
    v: Some[Double] = Some(2.0)
    scala> def root(n: Double) = if (n >= 0) Some(math.sqrt(n)) else None
    root: (n: Double)Option[Double]
    scala> v flatMap root
    res0: Option[Double] = Some(1.4142135623730951)

REBOL
- REBOL is yet another functional, homoiconic language
- However, REBOL does some things quite differently
  - REBOL is a small language with a simple syntax, but which nevertheless has over 45 useful data types
  - REBOL has no reserved words
  - REBOL operations are prefix, and each operation “knows” how many arguments it takes
    - sum: add 3 5 print sum is equivalent to (sum: (add 3 5)) (print sum)
  - Lists (“blocks”) in REBOL are doubly-linked, so traversing lists is conceptually quite different than in most languages

Postfix notation and Forth
- Postfix (or “Reverse Polish”) notation is a way of writing expressions that does not require
parentheses or rules of precedence
  - For example, (a + b) * (c - d) becomes a b + c d - *
  - Postfix expressions are easily computed using a simple stack
- Forth is a stack-based (hence, postfix) language
  - It consists of arithmetic functions along with many additional functions, including functions to manipulate the stack
  - The syntax is about the simplest possible: words, separated by whitespace
  - Implementation requires only a stack and a “dictionary” of functions
- Forth is most suitable for extremely memory-limited situations

Programming paradigms
- According to Wikipedia, there are five main paradigms: imperative, functional, object-oriented, logic and symbolic programming
  - Some imperative languages are Fortran, Basic, and C
  - Functional languages include ML, OCaml, and Haskell
  - Object-oriented languages include Java, C++, and C#
  - Prolog and its derivatives are the main logic languages
  - Symbolic programming languages include Prolog, Lisp, and Clojure
- A multiparadigm language is one with significant support for more than one paradigm
  - Scala is both object-oriented and functional
    - This makes it a good “capstone” language for CIS 554
  - Oz is a research language designed to include all paradigms

Scala
- I like to say: Scala = Java - cruft + functional programming
- Scala is based on Java but borrows a lot from Haskell
  - All the functional stuff
  - Hindley-Milner type inference (to the extent possible)
  - Monads
    - Monads are everywhere in Scala, but “under the hood” where they “just work”
- Scala borrows actors from Erlang, which we unfortunately did not cover
  - Erlang, in turn, is based on Prolog

Scala’s cruft removal
- In many ways, Scala is a simplification of Java
  - Semicolons, and many other bits of punctuation, are unnecessary
  - Most type declarations are unnecessary
  - Separate constructors are not necessary
    - There is no “disappearing constructor” problem
  - Static variables and methods do not exist
    - Their place is taken by objects, which is arguably better
  - The exact equivalent of the for loop has simpler, better syntax
    - However, Scala’s for loop has much more power
  - Scala’s match, unlike Java’s switch, does not “fall through”
    - Plus, match is far more general than switch
  - There are no checked exceptions
  - == works!
  - Scala has raw strings, which are great for multiline strings and regular expressions
  - Scala allows many more kinds of nesting (like methods within methods)

Scala’s corrections
- Scala has corrected some of Java’s mistakes
  - Fall-through in switch statements was always a bad idea
  - Tony Hoare, who invented null, calls it his “billion-dollar mistake”
    - Scala has null, but only so it can talk to Java
    - Instead, Scala has the Option monad
  - Scala uses a type lattice rather than a type hierarchy, which solves some technical problems
    - Among others, null acted as a subtype of every object type!
  - Scala eliminated Java’s depth subtyping, which is not type safe, and which generics made unnecessary

Scala’s additions
- Scala has:
  - Functional programming, including map, filter, fold, and for-comprehensions
    - Higher-order functions largely eliminate the need for loops (and are usually more readable)
    - Loops are one of the main reasons for needing mutable variables
    - Scala’s “for loops” are compiled into higher-order functions
  - Pattern matching (in many places, not just match expressions)
  - Actors (a well-tested technology, from Erlang)

Actors
- An actor is an independent flow of control
  - You can think of an actor as a Thread with extra features
- An actor does not share its data with any other process
  - This means you can write it as a simple sequential process, and avoid a huge number of problems that result from shared state
  - However: It is possible to share state; it’s just a very bad idea
- Any process can send a message to an actor with the syntax actor ! message
- An actor has a “mailbox” in which it receives messages
- An actor will process its messages one at a time, in the order that it receives them, and use pattern matching to decide what to do with each message
  - Except: Messages which don’t match any pattern are ignored, but remain in the mailbox (this is bad)
- An actor doesn’t do anything unless/until it receives a message

Covariance, contravariance, invariance
- Covariance and contravariance are properties of collections
- A collection of values of type A is covariant if it may be treated as a collection of values of some supertype of A
  - That is, you can use a subtype of the expected type
  - Lists are covariant because a List[Dog] may be treated as if it were a List[Animal]
      class List [+A] extends LinearSeq[A]
- A collection of values of type A is contravariant if it may be treated as a collection of values of some subtype of A
  - That is, you can use a supertype of the expected type
      trait Function1 [-T1, +R] extends AnyRef
  - This trait defines a function that is contravariant in
its argument type, and covariant in its return type
- A collection is invariant if it is neither covariant nor contravariant
- Functions are contravariant in their argument types and covariant in their return types

Scala’s problems
- As I see it, Scala has three main problems:
  1. The documentation is, shall we say, intimidating?
     - Most of the time you don’t need to know about contravariance and IterableOnce
     - The API could benefit from “progressive disclosure”
  2. It needs a better infrastructure
     - They’ve made a good start on an IDE, but there are a lot of tools (good debuggers, static analysis, etc.) that are still lacking
  3. It doesn’t have a major company behind it
     - Ok, it has Twitter, but they are (mostly) just users
     - École Polytechnique Fédérale de Lausanne and now TypeSafe
     - Mostly, it has to survive on its merits

Summary of summary
- “The world will little note, nor long remember, what we say here…”
- We haven’t spent enough time on any of these languages for you to have gotten good at them
  - (Though I hope I’ve given you a good start on Scala)
- So what’s the point?
  - In previous years I’ve had students tell me that the course has given them the confidence to take on new languages
  - If I’ve succeeded in that, I don’t feel this course was a waste of time, and I hope you don’t, either

The final exam
- We have covered a lot of languages, and I don’t expect you to remember the syntax of them all
  - However, there may be questions where you will need to read the syntax well enough to answer questions
- Exception: There will be questions that require you to know the basic syntax and use of regular expressions

Genealogy
- [Figure: language genealogy diagram relating Simula, Lisp, C, ML, Haskell, Prolog, Smalltalk, C++, Clojure, Erlang, Java, and Scala, with influences labeled objects, functional programming, syntax, pattern matching, and actors]

The End