Theory of Computation

Chapter 7 Recursion

Hilbert (1925) suggested that two strategies are sufficient, namely generalised composition (gc) and primitive recursion (pr).

Simple Composition

We look at this first. If h: T → V and g: V → W then h and g can be composed to give f, where f: T → W and, for all t ∈ T,

    f (t) = g (h (t))

Generalised Composition

If h1, h2, ..., hn are all functions of m arguments, and g is a function of n arguments, then we can construct a function f from g and h1, h2, ..., hn by generalised composition, with

    f (x) = g (h1 (x), h2 (x), ..., hn (x))

where x = (x1, x2, ..., xm), i.e. f is a function of m arguments. We write this as

    f = g (h1, h2, ..., hn)

Primitive Recursion

This is based on the principle of mathematical induction.

Definition

If g and h are given functions, we can construct a function f from g and h by primitive recursion as follows:

    f (x, 0)   = g (x)
    f (x, y+1) = h (x, y, f (x, y))

where x = (x1, x2, ..., xn) for some n > 0. The function g has n arguments, f has n + 1 arguments and h has n + 2 arguments. Often h does not depend on the arguments x and y, and so involves a projection function to select just the final argument.

Example - addition

The addition function can be defined by

    sum (x, 0)   = x
    sum (x, y+1) = succ (sum (x, y))

Example - multiplication

The multiplication function mult (x, y) = x * y can be defined:

    mult (x, 0)   = zero (x)
    mult (x, y+1) = sum (x, mult (x, y))

Example - factorial

    fac (0)   = 1
    fac (y+1) = (y+1) * fac (y)

E.g. fac (3) = 3 * fac (2) = 3 * 2 * fac (1) = 3 * 2 * 1 * fac (0) = 6

Our definition of primitive recursion only applies to functions of (at least) two parameters, so define

    fac1 (x, 0)   = 1
    fac1 (x, y+1) = (y+1) * fac1 (x, y)

Then fac = fac1 (one, P^1_1) by generalised composition, so that fac (z) = fac1 (1, z), where the value 1 of the first argument is unimportant.
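The two strategies above can be sketched as higher-order functions in Python. This is only an illustrative sketch: the helper names `gc`, `pr` and `proj` are assumptions chosen to mirror the abbreviations in the text, `add` stands in for `sum` (to avoid shadowing Python's built-in), `step` is a hypothetical name for the function h of the fac1 definition, and the recursion in `pr` is unrolled into a loop rather than written recursively.

```python
def zero(*x):
    """Base function: ignores its arguments and returns 0."""
    return 0

def one(*x):
    """Constant function returning 1 (used as 'one' in the text)."""
    return 1

def succ(n):
    """Base function: successor."""
    return n + 1

def proj(i, n):
    """Projection P^n_i: select the i-th of n arguments (1-based)."""
    def p(*args):
        assert len(args) == n
        return args[i - 1]
    return p

def gc(g, *hs):
    """Generalised composition: f(x) = g(h1(x), ..., hn(x))."""
    def f(*x):
        return g(*(h(*x) for h in hs))
    return f

def pr(g, h):
    """Primitive recursion: f(x, 0) = g(x); f(x, y+1) = h(x, y, f(x, y))."""
    def f(*args):
        *x, y = args
        acc = g(*x)                # f(x, 0)
        for k in range(y):         # build f(x, 1), ..., f(x, y) in turn
            acc = h(*x, k, acc)
        return acc
    return f

# sum(x, 0) = x ; sum(x, y+1) = succ(sum(x, y))
add = pr(proj(1, 1), gc(succ, proj(3, 3)))

# mult(x, 0) = zero(x) ; mult(x, y+1) = sum(x, mult(x, y))
mult = pr(zero, gc(add, proj(1, 3), proj(3, 3)))

# fac1(x, 0) = 1 ; fac1(x, y+1) = (y+1) * fac1(x, y)
step = gc(mult, gc(succ, proj(2, 3)), proj(3, 3))   # (x, y, z) -> (y+1) * z
fac1 = pr(one, step)

# fac(z) = fac1(1, z)
fac = gc(fac1, one, proj(1, 1))
```

With these definitions, `add(3, 4)`, `mult(3, 4)` and `fac(3)` evaluate to 7, 12 and 6 respectively, using nothing but the base functions and the two strategies.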
To show rigorously that fac1 can be defined using gc and pr, define

    fac3 = succ (P^3_2)
    fac2 = mult (fac3, P^3_3)

    fac1 (x, 0)   = one (x)
    fac1 (x, y+1) = fac2 (x, y, fac1 (x, y))

General Recursion

Consider Ackermann's function, defined by

    ack (x, y) = y + 1                        , if x = 0
               = ack (x-1, 1)                 , if y = 0
               = ack (x-1, ack (x, y-1))      , otherwise

This is computable, but it can be shown that it is not primitive recursive. We therefore need further strategies to permit the definition of all computable functions.

Furthermore, there is no equivalent in the primitive recursive functions (p.r.f.s) to the non-termination of a Turing Machine. Consider for example the functions

    red (x, y) = x - y    , if x >= y
               = x        , otherwise

(e.g. red (5,3) = 2, red (5,6) = 5)

    div (x, y) = 0                                , if x = 0
               = succ (div (red (x, y), y))       , otherwise

Then

    div (4,2) = succ (div (red (4,2), 2))
              = succ (div (2, 2))
              = succ (succ (div (red (2,2), 2)))
              = succ (succ (div (0, 2)))
              = succ (succ (0))
              = 2

However

    div (3,2) = succ (div (red (3,2), 2))
              = succ (div (1, 2))
              = succ (succ (div (red (1,2), 2)))
              = succ (succ (div (1, 2)))
              ...

'div' is recursively applied and either

(a) terminates when x = 0, or
(b) fails to terminate.

'div' is therefore NOT a p.r.f. (it is a partial function, whereas all p.r.f.s are total).

Partial functions can be computable, by which we mean that we can find every defined answer, and go on forever if undefined.

An alternative way of defining 'div' is:

    div (x1, x2) = the least y >= 0, if any, such that x2 * (y+1) > x1

i.e.

    div (x1, x2) = (μy) (x2 * (y+1) > x1)

where μ is the minimization operator.

Definition

If f is a function of n+1 arguments, we can construct a function g of n arguments by minimization out of f as:

    g (x) = (μy) (f (x, y) = 0)

where x = (x1, x2, ..., xn).

Definition

The functions that can be constructed from the set

    F = {zero, succ, pred, P^n_i (1 ≤ i ≤ n)}

of base functions, together with the set

    C = {gc, pr, μ}

of strategies (with the restriction that μ is only applied to total functions) are called the General Recursive Functions (GRF). These are equivalent to the computable functions. They all map from N^n to N (where n > 0).

Note

Let g (x) = (μy) (f (x, y) = 0)

If f is a GRF, and f is total, then g is a GRF, which may be partial.
If f is a partial GRF, then g may not be a GRF.

I.e., if f is a computable total function then g is computable, but if f is a computable partial function then g may not be computable.

Example

Define f (x, y) as follows:

    f (x, 0) = 6
    f (x, 1) = 7
    f (x, 2) = undefined (i.e. f is partial)
    f (x, 3) = 0
    f (x, y) = 8    for all y >= 4

Define g (x) = (μy) (f (x, y) = 0)

Clearly g (x) = 3

To calculate g (using a Turing Machine) we use a 'subroutine' to calculate f (x, y) for y = 0, 1, 2, ..., intending to halt when the first value of y for which f (x, y) = 0 is found. However, since f (x, 2) is undefined, the subroutine fails to halt for y = 2 and the calculation of f (x, 3) never starts. Thus g is not computable.

Chapter 8 Lambda-Calculus

Introduction to λ-calculus

The λ-calculus is a purely syntactic notation, consisting of Well-Formed Formulae (WFFs, or λ-expressions), defined by formal rules; and a set of Conversion Rules which enable one WFF to be converted to another. We regard two WFFs as being equivalent if one can be converted into the other by repeated use of one or more of the conversion rules.

Although purely syntactic, it can be shown that the λ-calculus can be used to model the computable functions, i.e. it is equivalent to Turing Machines or the General Recursive Functions.
[Diagram not reproduced.]

This diagram commutes, that is, it doesn't matter whether a λ-expression is interpreted and then functional computation is used, or whether the conversion rules are applied first and then the λ-expression is interpreted.

We shall return later to the idea of interpreting (or giving meaning to) λ-expressions; first we consider the λ-calculus as a purely syntactic system.

Definition of λ-calculus

A Well-Formed Formula (WFF) of the λ-calculus (also called a λ-expression) is one of the following:

1. A variable (a lower-case letter), e.g. x
2. The application (MN) of two WFFs M and N. When interpreted, this will be regarded as a function M applied to an argument N.
3. The abstraction (λx. M) of a WFF M where x is a variable. When interpreted, this will be regarded as a function with one formal parameter called x. The value returned by the function is M (which will probably involve x).

Note

Integers and arithmetic (+, - etc.) are not included in the λ-calculus, but they can be modelled (see later).

Parentheses can often be omitted without ambiguity, in which case the following rules apply:

1. Application associates to the left, i.e. KMN ≡ (K M) N
2. The "scope" of an abstraction is taken to extend as far as possible, consistent with parentheses (contrast this with the scope of a quantifier in predicate calculus, which is taken to be as small as possible).

For example, (λx. (λy. (M N))) could be written as (λx. (λy. M N)) or even as λx. λy. M N

Free and Bound Variables

An occurrence of variable x is said to be bound if it is in an abstraction of the form λx. M, and free otherwise.

Examples

1. ax          Both a and x occur free.
2. (λx.ax)     Both occurrences of x are bound, a still occurs free.
3. (λx.ax)x    The first two occurrences of x are bound, the third is free.

Substitution

When representing function calls, it can be necessary to replace all the free occurrences of a given variable with a λ-expression.
This corresponds to replacing a formal parameter with an actual parameter in programming. The notation used here is ([M/x] X), which means the expression formed when M replaces free occurrences of x in X.

E.g. ([a/x] (λy.x)) ≡ (λy.a)

1. (λx. xz) a converts to [a/x] xz = az,
2. (λx. xx) a converts to [a/x] xx = aa,
3. (λx. xz) (λy. yz) converts to [(λy. yz)/x] xz = (λy. yz) z, which converts to [z/y] yz = zz

Evaluate (λx. λy. yx) y z (which is ((λx.(λy. yx)) y) z)

First, evaluate (λx.(λy. yx)) y

Substituting y for x in λy. yx seems to give (λy. yy), so the answer seems to be

    (λy. yy) z = zz

whereas the correct result is of the form

    (λu. uy) z = zy

The bound occurrences of y in λy. yx must not be confused with the free occurrence of y that is supplied as the argument.

Remedy: Rewrite λy.yx as λu.ux before the substitution is made. For, λy.yx and λu.ux are the same, up to the renaming of bound variables. Such name clashes may be prevented by the use of substitution rules.

Conversion Rules

We now give names to the rules which we have used to manipulate λ-expressions.

α-conversion (alpha-conversion)
    If y is not free in X then λx.X cnv λy.[y/x] X    (Renaming of bound variables)

β-conversion (beta-conversion)
    (λx.M) N cnv [N/x] M    (Evaluation)

η-conversion (eta-conversion)
    If x is not free in M, then (λx.Mx) cnv M

Notes:

1. α-conversion permits the name of a bound variable to be changed, provided that no name clash is introduced. Compare with programming, where the name of a formal parameter can be changed provided that the new name does not clash with local or global variables.
2. Compare β-conversion with the substitution of actual parameters for formal parameters in programs.
3. All rules are reversible. Each rule states that when one of the expressions (left or right) occurs, it may be replaced by the other.
4. η-conversion and β-conversion permit the elimination of abstractions. These are called the reduction rules, or reductions.
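The substitution notation [M/x] X and the name-clash remedy described above can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the original notes: terms are represented as a plain string for a variable, `("app", M, N)` for an application and `("lam", x, M)` for an abstraction, and the helper names `free_vars`, `fresh`, `subst` and `beta` are hypothetical. Fresh variables are named `v0`, `v1`, ... purely for illustration.

```python
import itertools

def free_vars(t):
    """Return the set of variables occurring free in term t."""
    if isinstance(t, str):
        return {t}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    _, x, body = t                      # ("lam", x, body)
    return free_vars(body) - {x}

def fresh(avoid):
    """Pick a variable name v0, v1, ... that is not in the set avoid."""
    for i in itertools.count():
        name = f"v{i}"
        if name not in avoid:
            return name

def subst(m, x, t):
    """Compute [m/x] t: replace the free occurrences of x in t by m."""
    if isinstance(t, str):
        return m if t == x else t
    if t[0] == "app":
        return ("app", subst(m, x, t[1]), subst(m, x, t[2]))
    _, y, body = t
    if y == x:
        return t                        # x is bound here: nothing to replace
    if y in free_vars(m):
        # Name clash: rename the bound variable first (alpha-conversion).
        z = fresh(free_vars(m) | free_vars(body) | {x})
        body = subst(z, y, body)
        y = z
    return ("lam", y, subst(m, x, body))

def beta(t):
    """Contract a top-level beta-redex (lam x. M) N to [N/x] M, else return t."""
    if not isinstance(t, str) and t[0] == "app":
        f, n = t[1], t[2]
        if not isinstance(f, str) and f[0] == "lam":
            return subst(n, f[1], f[2])
    return t

# The example from the text: (lam x. lam y. y x) y z
inner = ("lam", "x", ("lam", "y", ("app", "y", "x")))
step1 = beta(("app", inner, "y"))       # bound y is renamed before substituting
result = beta(("app", step1, "z"))      # the application z y, not z z
```

Here `step1` is `("lam", "v0", ("app", "v0", "y"))`, which matches the form λu.uy from the text up to the choice of the new bound name, and `result` is `("app", "z", "y")`, i.e. zy.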
We say that one expression may be reduced to another, and write A red B to mean that A may be reduced to B by a series of reductions (and possibly some α-conversions).

    A red B implies A cnv B

An expression that can be reduced is called a redex. Thus for any WFFs M and N,

    (λx.M) N is a β-redex
    (λx.Mx) is an η-redex if x is not free in M

We write

    (λx.M) N red [N/x] M
    (λx.Mx) red M

An expression which contains no redexes is said to be in normal form.

If M red N and N is in normal form, then N is said to be the normal form of M, and may be interpreted to be the "value" of M.

Examples

1. (λx.λy.y)ab = (λx.(λy.y))ab
               = ((λx.(λy.y))a)b    (Application binds to the left)
               red (λy.y) b
               red b

2. (λf.λx.f(fx))ab red (λx.a(ax)) b
                   red a(ab)

3. (λx.xx)(λx.xx)

   (λx.xx) has the form (λx.Mx), but is not η-reducible since x occurs free in M.

   An attempt at β-reduction yields

       (λx.xx)(λx.xx) red (λx.xx)(λx.xx)

   So this expression has no normal form.

Some expressions can not be reduced to normal form.

Church-Rosser Theorems

These theorems show that an expression cannot have two different normal forms, i.e. if two reduction sequences for the same expression both yield a normal form, then they yield the same normal form (up to renaming of bound variables).

Arithmetic using the λ-calculus

It is possible to interpret the λ-calculus, i.e. to assign meaning to certain λ-expressions. We shall consider how to represent the non-negative integers.

Represent zero by

    0 ≡ λf.λx.x

and the successor function by

    succ ≡ λk.λf.λx.f(kfx)

    1 ≡ succ 0 ≡ (λk.λf.λx.f(kfx)) (λf.λx.x)
               cnv λf.λx.f((λf.λx.x) fx)
               cnv λf.λx.f((λx.x) x)
               cnv λf.λx.fx

    2 ≡ succ 1 cnv λf.λx.f(fx)
    3 ≡ succ 2 cnv λf.λx.f(f(fx))
    etc...

Appendix taken from http://perl.plover.com/lambda/

Perl contains the λ-calculus

λ-Calculus (pronounced `lambda calculus') is a model of computation invented by Alonzo Church in 1934.
It's analogous to Turing machines, but it's both simpler and more practical. Where the Turing machine is something like a model of assembly language, the λ-calculus is a model of function application. Like Turing machines, it defines a simplified programming language that you can write real programs in. Writing Turing machine programs is like writing in assembly language, but writing λ-calculus programs is more like writing in a higher-level language, because it has functions.

The two legal operations in the λ-calculus are to construct a function of one argument with a specified body, and to invoke one of these functions on an argument. What can be in the body of the function? Any legal expression, but expressions are limited to variables, function constructions, and function invocations. What can the argument be? It has to be another function; functions are all you have. With this tiny amount of machinery, we can construct a programming language that can express any computation that any other language can express.

Unlike most popular programming languages, Perl is powerful enough to express the λ-calculus directly, without the need to write a simulator. This means that if you want to try programming in the λ-calculus, you can do it directly in Perl, without having to implement a program to parse and evaluate λ-calculus expressions first. Perl's parser will parse λ-expressions, if you write them properly, and its evaluator will evaluate them.

In the λ-calculus, a function with formal parameter x and body B is denoted λx.B. In Perl, we write sub { my $x = shift; B }.

To apply the function P to an argument Q, the usual λ-calculus notation is just (P Q). In Perl, we write $P->($Q).

The only expressions not of either of these two forms are simple variables.

To apply the function λv.B to some argument A is simple. The result is just B, but with any occurrences of v replaced with A instead.
For example (λx.(x (x y)) (P Q)) reduces to ((P Q) ((P Q) y)); we replaced all the x's with (P Q)s.

By convention, function application is taken to be left-associative, so that (P Q R) is short for ((P Q) R).

From these few materials we can construct a programming system capable of performing arithmetic, constructing lists and trees, and expressing arbitrary recursive functions on these objects.

Runnable Perl source code:

Demonstration of recursive functions constructed with fixpoint operator. (Note: The punch line is at the end, so you might want to read it backwards.)
    http://perl.plover.com/lambda/jfp/lambda-brief.pl

Demonstration of Church numeral arithmetic. (Note: Ditto.)
    http://perl.plover.com/lambda/jfp/lambda-church.pl

If you're familiar with λ-calculus, but not with Perl, read the article that I submitted to the Journal of Functional Programming.
    PostScript version: http://perl.plover.com/~mjd/perl/lambda/for-journal.ps

If you're familiar with Perl, but not with the λ-calculus, read this other article instead. (Caution: Still in draft stage.)
    HTML version: http://perl.plover.com/~mjd/perl/lambda/tpj.html

I gave a talk about this on 18 November, 1999, to the Princeton chapters of the ACM and the IEEE Computer Society. The slides are available online.
    http://perl.plover.com/~mjd/perl/yak/lambda
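
The appendix's observation that a language with first-class functions can host λ-expressions directly applies to other languages as well. As a rough sketch, here is the Church-numeral representation of Chapter 8 (zero ≡ λf.λx.x and succ ≡ λk.λf.λx.f(kfx)) written with Python lambdas instead of Perl subs. The helper `to_int` is an assumption added only for inspection: it steps outside the pure calculus by applying a numeral to ordinary integer increment.

```python
# zero = lam f. lam x. x
zero = lambda f: lambda x: x

# succ = lam k. lam f. lam x. f (k f x)
succ = lambda k: lambda f: lambda x: f(k(f)(x))

one = succ(zero)
two = succ(one)
three = succ(two)

def to_int(n):
    """Hypothetical helper: read a Church numeral back as a Python int."""
    return n(lambda i: i + 1)(0)
```

Calling `to_int(three)` applies the increment function three times to 0 and so yields 3, mirroring the normal form λf.λx.f(f(fx)) derived for 3 in Chapter 8.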