Why Do Monads Matter?
Derek Wright
London Haskell user group
24th October 2012

Welcome to London Haskell 2.0
• I am not an expert...
• I am an amateur in Haskell
• Give a talk and you will learn more...

Why Do Monads Matter?
Sententia cdsmithus
• Blog post by Chris Smith, 18th April 2012
• http://cdsmith.wordpress.com/
• All the good stuff is Chris’s, and all the mistakes are mine

Why Do Monads Matter?
Category Theory for Software Development?
• Match made in heaven? Or abstraction distraction?
• Have you tried to learn monads and they didn’t click?

Functions for Computer Programmers
• Quiz: Do programmers use functions?
• Yes! Of course they do...
• “That’s like asking if carpenters use nails!”

Functions for Mathematicians
• A bit more complicated...
• A function is just an association of input values to output values

Functions?
• Functions that return random numbers
• Functions that return one answer on Sundays, but a different answer on Mondays, yet another on Tuesdays...
• Functions that cause words to appear on the screen every time you calculate their values

Functions?
• Something different?
• They have parameters (domain)
• They return values (range)
• And we can compose them: f composed with g is (f ◦ g)(x) = f(g(x))

Composition
• Composition puts together two functions to make a new function
• But the range of the first and the domain of the second must match

Executive Summary So Far
1) When computer programmers talk about functions, they do not mean exactly what mathematicians do
2) What they do mean is the idea of having inputs (domains), outputs (ranges), and most importantly composition

Along Came The Category
• Collections of “Objects” (you should think of sets)
• and “Arrows” (you should think of functions between sets)
• where each arrow has a domain and a range
• each object has an “Identity” arrow (think of the identity function, where f(x) = x)
• and arrows can be composed when the domains and ranges match up right
• If you compose any function with an identity, it doesn’t actually change
• Composing functions is associative

Categories in 1 Diagram

Example Categories
“Things from algebra, like groups and rings and vector spaces; things from analysis, like metric spaces and topological spaces; things from combinatorics, like elements of partially ordered sets and paths in graphs; things from formal logic and foundations, like proofs and propositions.”

The Four Horsemen of the Catapocalypse

The First Horseman: Failure
• A function might fail...
• You might get an answer
• But you might get an error

The Second Horseman: Dependence
• Functions of mathematics are nice and self-contained
• Computer programs are messes of configuration
• Lots of computer programs depend on information that is “common knowledge”
• We want every function to be passed the configuration

The Third Horseman: Uncertainty
• Non-determinism
• A normal function associates each input to one output
• A non-deterministic function associates each input to some number of possible outputs
• e.g. parsing, search, verification, querying...
• Often solved with special-purpose languages

The Fourth Horseman: Destruction
• The only observable result of evaluating a mathematical function is its value
• Computer programs have effects
• “... these destructive effects are in a sense the whole point of computer programming; a program that has no observable effects would not be worth running!”
• Order matters

Back To The Function
• We build software that: might fail, has to deal with extra context, models non-deterministic choice and sometimes has observable effects on the world
• Seems that we’ve left the nice and neat world of mathematical functions far behind...
• We have NOT!

Functioning With Failure
• The result of a function that could fail is either:
  a success, which is the intended result, or
  a failure, which is a description of why the attempt failed
• So for any set A we can define Err(A), which includes both
• A function from A to B that could fail is just an ordinary function from A to Err(B)

Functioning With Dependence
• For a set A we define Pref(A) as the set of functions from application settings to A
• A function from A to B that depends on the settings is just an ordinary function from A to Pref(B): given a value from A, you get back another function that maps from application settings to the set B

Functioning With Uncertainty
• Not one specific answer, many possible answers
• For each set A, define P(A) to be the power set of A
• A non-deterministic function from A to B is just an ordinary function from A to P(B)

Functioning With Destruction
• For each set A we define IO(A)
• An element of the set IO(A) is a list of instructions for obtaining a member of A
• It is not a member of A, merely a way to obtain one, and that procedure might have any number of observable effects

But what about composition?
• Looks like we have lost composition...
• Those function domains and ranges don’t match up
• We can’t compose them! Oh no…

Hold Your Horses, Heinrich Kleisli to the Rescue!
• All is not lost; we just haven’t been told how to compose these “special” functions
• The “special” functions are named Kleisli arrows
• Since they are just functions we can compose them as functions
• But they are “special” and we can compose them as Kleisli arrows, too
• Sets are a category but we want a new kind of category: a Kleisli category

Kleisli Category
• What do we need:
• Objects = Same, just sets of things
• Arrows = Kleisli arrows
• Identities = ?
• Composition = ?

Identities and Composition for Err
• Given a failure Kleisli arrow from A to B, and one from B to C
• We want to compose them into a Kleisli arrow from A to C
• We have an ordinary function from A to Err(B) and a function from B to Err(C)
• How do we compose them? ...
• Central idea of error handling is: if the first function gives an error, then we should stop and report the error
• Only if the first function succeeds, continue on to the second function (using the result of the first function as input to the second) and give the result from that
• If g(x) is an error, then (f ◦ g)(x) = g(x)
• If g(x) is a success, then (f ◦ g)(x) = f(g(x))
• Identity Kleisli arrows?
• Don’t do anything
• Identities are functions from A to Err(A)
• Just the function f(x) = x
• Never return an error, only a successful result

Identities and Composition for Dependence
• Functions from A to Pref(B)
• Equivalent to adding a new parameter for the application preferences
• Two functions that both need the application preferences
• Give the same preferences to both
• Kleisli identity gets the extra preferences parameter but ignores it.
  It just returns its input

Identities and Composition for Non-Determinism
• Functions from A to P(B), the power set of B
• Try all possible values that might exist at this point and collect the results from all of them
• Composition applies the 2nd function to each possible result of the 1st, then the results are merged together into a single set
• Identities return one-element sets containing the input

Identities and Composition for Destructive Effects
• Functions from A to IO(B)
• Combine instructions by following them in a step-by-step manner: 1st do one, then the next
• So the composition writes instructions to perform the 1st action, looks up the 2nd action from the result and then performs the 2nd action, in that order
• Kleisli identity is just an instruction to do nothing at all and announce the input as the result

Kleisli Category for each example
• Created a new category, the Kleisli category, for each example
• Each has its own function-like things
• and its own composition
• and identities
• that express the unique nature of each specific problem

Monads?!?
• And that’s why we should care about monads
• Monads?!? We’ve just learned about monads. We simply forgot to use the word.

What’s This Got To Do With Monads?
• To make sure they are monads in the conventional way, we’d have to work pretty hard:
• 1st prove that they are functors
• Then build two natural transformations called η (eta) and µ (mu) and prove that they are natural
• Finally, prove the three monad laws
• Heinrich Kleisli pointed out that if you can build a category like we did, whose arrows are just functions with a modified range, then your category is guaranteed to also give you a monad
• Good, because computer programmers care more about their Kleisli arrows than the mathematician’s idea of monads

Kleisli category to Monad
• The traditional definition of a monad requires that it is a functor.
• Given a function f from A to B, we need to construct a function Err(f) from Err(A) to Err(B)
• Natural transformation η (eta) from the identity functor to Err
• Finally, a natural transformation µ (mu) from Err² to Err

Monad to Kleisli category
• Going in the opposite direction, from a monad to the Kleisli category, is easier
• Given monad Err, with η and µ, the Kleisli category is constructed as follows:
• The identities are just the components of η
• Given a function f from A to Err(B) and a function g from B to Err(C), compose the two as µ ◦ Err(g) ◦ f

Executive Summary so far
1) Computer programmers like to work by composing things together, which we call functions
2) They aren’t functions in the obvious way… but make a category
3) They are functions after all, but only if you change the ranges into something weirder
4) The category that they form is a Kleisli category and it’s another way of looking at monads
5) These monads / Kleisli categories nicely describe the techniques we use to solve practical problems

Joining The Monadic Revolution
• What about the humble computer programmer?
• Monads are making their way into practical problems too
• In the past, Kleisli arrows were built into our programming languages ... want something different, too bad

The Past: Error Handling
• The Err monad, bar some details, is structured exception handling
• But programmers are left to do composition by hand
• No way to get a better Kleisli arrow without changing programming language

The Present: Global Variables and Context
• In the past, we had global variables
• OOP tried to alleviate the problem, by having functions run in a specific “object” that serves as their context
• A better Kleisli arrow, but not a perfect answer

The Near Future (/ Present)
• Purity, Effects, Parallelism, Non-Determinism, Continuations, and More!
• Everything is already possible
• Parallelism needs a Kleisli arrow that is less powerful
• Want to separate the destructive updates from pure functions
• Need more than one kind of Kleisli arrow, in the same language!
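
The Kleisli identities and compositions described in the slides can be sketched in Haskell roughly as follows. This is a minimal illustration under stated assumptions, not code from the talk or from the blog post: Err is modelled as Either String, Pref as a function from a hypothetical Settings record, P as a list standing in for a set (duplicates are not merged), and IO as Haskell's built-in IO. Every name here (composeErr, Settings, safeDiv100, and so on) is made up for illustration. The last part shows the monad-to-Kleisli direction, µ ◦ Err(g) ◦ f, for Err only.

```haskell
-- Sketch only: all type and function names below are illustrative assumptions.

-- Err(A): a failure description or the intended result.
type Err a = Either String a

idErr :: a -> Err a              -- Kleisli identity: never fails
idErr = Right

-- Stop at the first error; otherwise feed the result to the second arrow.
composeErr :: (b -> Err c) -> (a -> Err b) -> (a -> Err c)
composeErr f g x = case g x of
  Left e  -> Left e
  Right y -> f y

-- Pref(A): functions from application settings to A (Settings is hypothetical).
newtype Settings = Settings { verbose :: Bool }
type Pref a = Settings -> a

idPref :: a -> Pref a            -- takes the settings but ignores them
idPref x _ = x

-- Give the same settings to both arrows.
composePref :: (b -> Pref c) -> (a -> Pref b) -> (a -> Pref c)
composePref f g x s = f (g x s) s

-- P(A): a list is used here as a stand-in for a set of possible answers.
type P a = [a]

idP :: a -> P a                  -- one-element collection containing the input
idP x = [x]

-- Apply the second arrow to every possible result of the first, then merge.
composeP :: (b -> P c) -> (a -> P b) -> (a -> P c)
composeP f g x = concatMap f (g x)

-- IO(A): perform the first action, then the action chosen from its result.
idIO :: a -> IO a
idIO = return

composeIO :: (b -> IO c) -> (a -> IO b) -> (a -> IO c)
composeIO f g x = g x >>= f

-- The conventional monad structure for Err: functor action, eta and mu.
mapErr :: (a -> b) -> Err a -> Err b
mapErr _ (Left e)  = Left e
mapErr h (Right x) = Right (h x)

etaErr :: a -> Err a
etaErr = Right

muErr :: Err (Err a) -> Err a
muErr (Left e)  = Left e
muErr (Right r) = r

-- Monad to Kleisli: compose f : A -> Err(B) with g : B -> Err(C) as mu . Err(g) . f
kleisliFromMonad :: (a -> Err b) -> (b -> Err c) -> (a -> Err c)
kleisliFromMonad f g = muErr . mapErr g . f

-- Two example arrows that can fail (hypothetical).
safeDiv100 :: Int -> Err Int
safeDiv100 0 = Left "division by zero"
safeDiv100 n = Right (100 `div` n)

checkEven :: Int -> Err Int
checkEven n
  | even n    = Right n
  | otherwise = Left "odd result"

-- An example arrow that depends on the settings (hypothetical).
describe :: Int -> Pref String
describe n s = if verbose s then "value: " ++ show n else show n
```

For instance, `composeErr checkEven safeDiv100 0` evaluates to `Left "division by zero"`: the second arrow is never run. Haskell's standard library already provides this composition for any Monad instance as `<=<` and `>=>` in Control.Monad, so the hand-written versions above are only there to make each rule visible.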