Real numbers in type theory - Informatics Homepages Server

Real numbers in type theory Alex Simpson Laboratory for Foundations of Computer Science School of Informatics, University of Edinburgh (Based on joint work with Martı́n Escardó, University of Birmingham) 3-lecture course Days in Logic, Braga, 23–25 Jan 2014 Desiderata 1. Compute exactly with real numbers, avoiding rounding errors. 2. Program using implementation-independent primitives, so we don’t need to know how real numbers are represented. 3. Reason about programs on real numbers; e.g., prove equations. We address these by extending type theory with an abstract datatype of real numbers, with associated equational laws. This approach also addresses another desideratum: 4. Give a foundationally neutral mathematical definition of real number purely in terms of computationally meaningful primitives. Course structure Part 1: The intensional approach. Computing with representations of reals. Part 2: The extensional approach. An abstract datatype for the interval [−1, 1]. Part 3: A semi-extensional approach. A mantissa-exponent representation of R. Part 1: The intensional approach. Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0. 0. Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0. 0. ? Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 0. 1 0. ? Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0. 1 0 0. ? Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0 0. 1 0 1 0. ? Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0 1 0. 1 0 1 0 0. ? Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0 1 0 0. 1 0 1 0 1 0. ? Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0 1 0 1 0. 1 0 1 0 1 0 0. ? Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0 1 0 1 0 0. 1 0 1 0 1 0 1 0. ? Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0 1 0 1 0 0. 1 0 1 0 1 0 1 0. ? ... ... Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0 1 0 1 0 0. 1 0 1 0 1 0 1 0. ? It is impossible to write such a program! ... ... Representing reals exactly Natural first attempt: represent a real number in [0, 1] exactly as a lazy stream providing infinite binary expansion. Let’s see how we might try to write a program computing the average of two real numbers in [0, 1] using this representation. 0. 0 1 0 1 0 1 0 0. 1 0 1 0 1 0 1 0. ? ... ... It is impossible to write such a program! Technically, there is no continuous function {0, 1}ω × {0, 1}ω → {0, 1}ω that computes average. Three implementations of [−1, 1] that do work Using streams: q0 : q1 : q2 : q3 : q4 : q5 : . . . 1. Fast Cauchy sequences Require qi to be rational numbers in [−1, 1] such that |qj − qi | ≤ 2−i , whenever j ≥ i, The stream represents the real number limi qi . 2. Dyadic digits Require qi to be dyadic rational numbers in [−1, 1]. The P stream represents the real number i≥0 2−(i+1) qi . 3. Signed binary This is the special case of dyadic digits in which we require qi ∈ {−1, 0, 1}. Three implementations of [−1, 1] that do work Using streams: q0 : q1 : q2 : q3 : q4 : q5 : . . . 1. Fast Cauchy sequences Require qi to be rational numbers in [−1, 1] such that |qj − qi | ≤ 2−i , whenever j ≥ i, The stream represents the real number limi qi . 2. Dyadic digits Require qi to be dyadic rational numbers in [−1, 1]. The P stream represents the real number i≥0 2−(i+1) qi . 3. Signed binary This is the special case of dyadic digits in which we require qi ∈ {−1, 0, 1}. These representations are interconvertible. Simple recursive programs using dyadic digits Average: avg(q : r , q 0 : r 0 ) = (q ⊕ q 0 ) : avg(r , r 0 ) where q ⊕ q 0 = q+q 0 2 . Multiplication: digit mult(q, q 0 : r 0 ) = (qq 0 ) : digit mult(q, r 0 ) mult(q : r , q 0 : r 0 ) = avg( (qq 0 ) : mult(r , r 0 ), avg(digit mult(q, r 0 ), digit mult(q 0 , r )) ) Converting dyadic digits to signed binary dy add(q, q 0 : r 0 ) =  0 0  1 : dy add((2q + q − 1), r ) (2q + q 0 ) : r 0   (−1) : dy add((2q + q 0 + 1), r 0 ) if 2q + q 0 ≥ 1 if −1 ≤ 2q + q 0 ≤ 1 if 2q + q 0 ≤ −1 convert(q0 : q1 : r ) =   1 : convert(dy add(q0 − 1, q1 : r )) 0 : convert(dy add(q0 , q1 : r ))   (−1) : convert(dy add(q0 + 1, q1 : r )) if q0 + q21 ≥ 12 if − 21 ≤ q0 + q21 ≤ if q0 + q21 ≤ − 21 1 2 Computing a useful infinite sum Consider a dyadic digit sequence as given by a function from natural numbers to dyadic rationals. The program M((q0 : q1 : r ) : (q 0 : r 0 ) : s) = 2q0 + q1 + q2 : (M(avg(r , r 0 ) : s)) 4 of type (dyadic digit stream) stream → dyadic digit stream computes the function X (xi )i≥0 7→ 2−(i+1) xi : [−1, 1]ω → [−1, 1] i≥0 Two issues I What programming capabilities do we need to write such programs? I Programming in the above style is done at a very nuts-and-bolts level. E.g., we need to take extreme care that programs behave equivalently on different intensional representations of the same real number; e.g., f (0 : 0 : 0 : 0 : 0 : . . . ) and f ((−1) : 1 : 1 : 1 : 1 : . . . ) should both result in representations of the same real number. Two issues I What programming capabilities do we need to write such programs? We shall see that such programs can be written in the minimal type-theoretic framework of typed λ-calculus with natural numbers (Gödel’s System T). I Programming in the above style is done at a very nuts-and-bolts level. E.g., we need to take extreme care that programs behave equivalently on different intensional representations of the same real number; e.g., f (0 : 0 : 0 : 0 : 0 : . . . ) and f ((−1) : 1 : 1 : 1 : 1 : . . . ) should both result in representations of the same real number. Two issues I What programming capabilities do we need to write such programs? We shall see that such programs can be written in the minimal type-theoretic framework of typed λ-calculus with natural numbers (Gödel’s System T). I Programming in the above style is done at a very nuts-and-bolts level. E.g., we need to take extreme care that programs behave equivalently on different intensional representations of the same real number; e.g., f (0 : 0 : 0 : 0 : 0 : . . . ) and f ((−1) : 1 : 1 : 1 : 1 : . . . ) should both result in representations of the same real number. A major goal of this course is to introduce an abstract datatype for real numbers, using which such extensional behaviour is guaranteed. Gödel’s System T Types: σ ::= N | σ → τ Terms: I Typed variables: x : σ (where x is a variable of type σ) I Abstractions: if t : τ then (λx : σ. t) : σ → τ I Applications: if t : σ → τ and u : σ then t u : τ . Constants: 0 : N s : N→N rec : σ → (σ → N → σ) → N → σ Computation Single-step reduction (applicable arbitrarily deep inside terms): (λx. t) u −→ t[u/x] rec u v 0 −→ u rec u v (s(t)) −→ v (rec u v t) t Theorem (Tait 1960s). Reduction in System T is strongly normalizing. Reduction is also confluent (Church-Rosser). It follows that every closed term of type N reduces to a unique numeral (a term of the form sk (0)). Strong normalization for System T implies (and is in fact equivalent to) the 1-consistency of Peano Arithmetic. Set-theoretic semantics Semantics of types: [[N]] = N [[σ → τ ]] = [[σ]] → [[τ ]] Every closed term t : τ determines an element [[t]] ∈ [[τ ]] (The semantics routinely generalises to any cartesian-closed category (CCC) with natural numbers object (NNO).) Set-theoretic semantics Semantics of types: [[N]] = N [[σ → τ ]] = [[σ]] → [[τ ]] Every closed term t : τ determines an element [[t]] ∈ [[τ ]] (The semantics routinely generalises to any cartesian-closed category (CCC) with natural numbers object (NNO).) Say that a function f : Nk → N is T-representable if there exists a closed term t : Nk → N such that, for all n1 , . . . , nk ∈ N, t n1 . . . nk −→∗ f (n1 , . . . , nk ) Theorem (essentially Gödel). A function is T-representable if and only if it is provably total in Peano Arithmetic. Equalities (valid in any CCC with NNO) I If t −→ u then t = u. I If t : σ → τ then t = λx : σ. t x. I If w : σ → (σ → N → σ) → N → σ and u : σ and v : σ → N → σ and w u v 0 = u and w u v (s(x)) = v (w u v x) x then w u v = rec u v . Equalities (valid in any CCC with NNO) I If t −→ u then t = u. I If t : σ → τ then t = λx : σ. t x. I If w : σ → (σ → N → σ) → N → σ and u : σ and v : σ → N → σ and w u v 0 = u and w u v (s(x)) = v (w u v x) x then w u v = rec u v . Exercises on System T. 1. Define the following functions of type N → N → N: addition, multiplication, Ackermann’s function. 2. Prove commutativity and associativity for addition and multiplication. Prove distributivity of multiplication over addition. Encoding real numbers in System T We can encode any of our stream representations of [−1, 1] using the type N → N. E.g., in the case of signed binary, we can consider a function α : N → N as encoding the signed binary stream ((α(0) mod 3) − 1) : ((α(1) mod 3) − 1) : ((α(2) mod 3) − 1) : . . . The different representations are all intertranslatable in System T. Moreover, all the programs we gave for computing functions on real numbers using the signed binary and dyadic digits representation are routinely translatable into System T terms of appropriate type. Encoding functions on reals in System T For convenience we fix on signed binary representation. Consider the function real : (N → N) → [−1, 1] defined by X real(α) = 2−(i+1) ((α(i) mod 3) − 1) i≥0 We say that a function f : [−1, 1]k → [−1, 1] is T-representable if there exists a closed term t : (N → N)k → N → N making the following diagram commute (N → N)k [[t]] - realk (N → N) real ? [−1, 1]k f ? - [−1, 1] Comments I All T-representable functions are continuous. I Very many continuous functions are T-representable. I But programming functions on reals in System T has to be done in the nuts-and-bolts style mentioned earlier. Comments I All T-representable functions are continuous. I Very many continuous functions are T-representable. I But programming functions on reals in System T has to be done in the nuts-and-bolts style mentioned earlier. Address last point by introducing an abstract datatype for a closed interval of real numbers. Programs on real numbers can then be written in an automatically extensional way using the interface for the abstract datatype. Part 2: The extensional approach. System I We extend System T with a new type: σ ::= N | σ → τ | I I is abstract datatype for the closed interval [−1, 1] of real numbers. Semantics: [[ I ]] = [−1, 1] The interface for the type will roughly implement the idea that the closed interval is determined as the free convex set on 2 generators (−1, 1) with respect to affine maps. Since the very formulation of convexity requires a pre-existing interval, we replace convexity with the existence of iterated midpoints and we replace affineness with preservation of (iterated) midpoints. [Escardó & S., LICS 2001] Interface and semantics New constants and their meanings 1 : I −1 : I m : I→I→I M : (N → I) → I aff : I → I → I → I [[1]] = 1 [[−1]] = − 1 x +y [[m]] x y = 2 [[M]] f = [[aff]] x y z = X f (i) 2(i+1) i≥0 (1 − z) x + (1 + z) y 2 Computation I is an abstract datatype. We don’t specify a particular concrete implemention, but it is vital that implementation is possible. We have seen three possible implementations of the type I using the type N → N in System T. In System I, programming on real numbers is performed purely using the interface, irrespective of implementation. The program stays the same even if the implementation changes (e.g., to improve efficiency). Programs on reals are automatically extensional. Some simple programming x ⊕ y = m x y = M(x, y , y , y , y , y , y , . . . ) (So not actually necessary to include m as a primitive, since derivable from M. Nonetheless included for convenience.) 0 = (−1) ⊕ 1 −x = aff 1 (−1) x xy = aff (−x) x y 1 = M(1, −1, 1, −1, 1, −1, 1, −1, . . . ) 3 More generally, any rational number is definable using M applied to an eventually periodic sequence of 1s and −1s. Still more generally, any real number with a T-definable binary expansion is definable. Equalities mt t = t mt u =mu t m (m t u) (m v w ) = m (m t v ) (m u w ) M ti = m(t0 , M ti+1 ) i i λi. ti = λi. m(ui , ti+1 ) aff t u (−1) = t =⇒ t0 = M ui i aff t u 1 = u aff t u m(v , w ) = m(aff t u v , aff t u w ) For w : I → I → I, if w t u (−1) = t and w t u 1 = u and w t u m(x, y ) = m(w t u x, w t u y ) then w t u = aff t u Consequences (exercise!) −−x = x x ⊕ −x = 0 x0 = 0 xy = yx (x y ) z = x (y z) x (−y ) = −(x y ) x (y ⊕ z) = (x y ) ⊕ (x z) More programming, medial power series Suppose X f (x) = an x n n≥0 where an ∈ [−1, 1]. Then x n 1X 1 x f = an 2 2 2 2 n≥0 X xn = an 2n+1 n≥0 = M an x n n So, for example: 1 2−x x 1 exp 2 2 1 x sin 2 2 1 x ln 1 + 2 2 r 1 x 1+ 2 2 = M xn n = M n xn n! n n−1 x = M parity(n)(−1)b 2 c n n! = M (−1)n n = M n x n+1 (n + 1) (−1)n (2n)! xn (1 − 2n)(n!)2 4n All these define total continuous functions [−1, 1] → [−1, 1]. I-definable functions A function f : [−1, 1]k → [−1, 1] is said to be I-definable if there exists a closed System I term t : Ik → I such that [[t]] x1 . . . xk = f (x1 , . . . , xk ) Any I-definable function is automatically continuous, indeed computable. Proposition. If f is I-definable then it is T-representable. A non-definable function   if 21 ≤ x 1 dbl(x) = 2x if − 12 ≤ x ≤   −1 if x ≤ − 21 Note that dbl is continuous though not smooth. Also dbl is computable, in fact T-representable. 1 2 A non-definable function   if 21 ≤ x 1 dbl(x) = 2x if − 12 ≤ x ≤   −1 if x ≤ − 21 Note that dbl is continuous though not smooth. Also dbl is computable, in fact T-representable. Proposition. The function dbl is not definable. 1 2 A non-definable function   if 21 ≤ x 1 dbl(x) = 2x if − 12 ≤ x ≤   −1 if x ≤ − 21 Note that dbl is continuous though not smooth. Also dbl is computable, in fact T-representable. Proposition. The function dbl is not definable. Will prove this later. 1 2 System II Extend System I with a new constant dbl : I → I denoting the function dbl : [−1, 1] → [−1, 1] New equations: dbl m(1, m(1, x)) = 1 dbl m(0, x) = x dbl m(−1, m(−1, x)) = −1 One consequence is cancellation: m(t, u) = m(t, v ) =⇒ u = v Programming in System II [x + y ] = dbl(x ⊕ y ) x y = x ⊕ (−y ) [x − y ] = dbl(x y ) max(0, x) = [[x − 1] + 1] max(x, y ) = dbl x2 + max (0, y x) min(x, y ) = − max(−x, −y ) |x| = max(−x, x) Question. Are any of the last 4 functions definable in System I? A limit-finding function in System II Define fastlim : (N → I) → I by: fastlimi xi = dbl M dbln+1 (xn+1 xn ) n Lemma. If |xn+1 − xn | ≤ 2−n then fastlimi xi is the limit of the (fast) Cauchy sequence (xn )n . Note that fastlimi xi always returns a value, even if (xn )n is non-convergent. Also if (xn )n converges slowly, fastlimi xi need not be the limit. II-definable functions A function f : [−1, 1]k → [−1, 1] is said to be II-definable if there exists a closed System II term t : Ik → I such that [[t]] x1 . . . xk = f (x1 , . . . , xk ) Proposition. If f is II-definable then it is T-representable. II-definable functions A function f : [−1, 1]k → [−1, 1] is said to be II-definable if there exists a closed System II term t : Ik → I such that [[t]] x1 . . . xk = f (x1 , . . . , xk ) Proposition. If f is II-definable then it is T-representable. Theorem. If f is T-representable then it is II-definable. Outline of proof 1. Use the T-representation t of f to construct a Cauchy sequence of polygon approximations to f in System II. 2. Extract a modulus of uniform continuity for t on the compact subset (N → {0, 1, 1})k . 3. Use the modulus of uniform continuity to obtain a fast Cauchy sequence of polygon approximations. 4. A term defining f is then obtained using fastlim. A gluing function Define glue : (I → I)2 → I → I by 1 glue f g x = dbl dbl f dbl x + 12 ⊕ g dbl x − 12 2 f (1) Then glue represents the function: glue f g x = [f (dbl x + 12 ) + g (dbl x − 12 )) − f (1)] which, whenever f (1) = g (−1), satisfies ( f (2x + 1) if −1 ≤ x ≤ 0 glue f g x = g (2x − 1) if 0 ≤ x ≤ 1 Polygon approximations For every k ≥ 1, we define a System II term: apprk : N → ((N → N)k → I) → (Ik → I) The base case appr1 is defined by recursion on N to satisfy: appr1 0 h = aff (h(−1)) (h(1)) appr1 (n + 1) h = glue (appr1 n (λx. h(avg(x, −1)))) (appr1 n (λx. h(avg(x, 1)))) Given apprk , the term apprk+1 is given by apprk+1 n h x0 x1 . . . xk = appr1 n (λy0 . apprk n (h y0 ) x1 . . . xk ) x0 The application apprk n h produces a piecewise multilinear approximation to the function h, with the argument types changed from N → N to I. More precisely, the apprk n function uses k-tuples of values from the set Qn := {qni | 0 ≤ i ≤ 2n } where qni := i −1 2n−1 to form a lattice of (2n + 1)k rational partition points in [−1, 1]k . The application apprk n h then results in a function Ik → I that agrees with h at the partition points, and is (separately) affine in each coordinate between partition points. It is also affine in the h argument. The lemma on the next slide formalises this. Lemma. If h : (N → N)k → I represents f : [−1, 1]k → [−1, 1] then: 1. For all r0 , . . . , rk−1 ∈ Qn we have: apprk n h r0 . . . rk−1 = f r0 . . . rk−1 2. For 0 ≤ j < k, 0 ≤ i < 2n , and 0 ≤ λ ≤ 1 xj+1 . . . xk−1 = apprk n h x0 . . . xj−1 2i+λ n−1 − 1 (1 − λ) apprk n h x0 . . . xj−1 qni xj+1 . . . xk−1 + λ apprk n h x0 . . . xj−1 qni+1 xj+1 . . . xk−1 i i+1 ). Note that 2i+λ n−1 − 1 = ((1 − λ) qn + λ qn Also, if h1 , h2 : (N → N)k → I are real-respecting then 3. For 0 ≤ λ ≤ 1, we have: apprk n ((1 − λ)h1 + λh2 ) x0 . . . xk−1 = (1 − λ) apprk n h1 x0 . . . xk−1 + λ apprk n h2 x0 . . . xk−1 Continuity in System T Given a closed System T term t : (N → N) → N → N it holds that [[t]] : (N → N) → (N → N) is continuous with respect to the product topology on N → N (Baire space). In detail, this means that for any α : N → N, we have: For all e ≥ 0, there exists d ≥ 0 such that, for all β : N → N with β d = α d , it holds that [[t]](α) e = [[t]](β) e (We write α n for the n-tuple (α(0), . . . , α(n − 1)) ∈ Nn .) Uniform continuity in System T Since N → {0, 1, 2} ⊆ N → N is compact, it follows that [[t]] is uniformly continuous on this subspace; i.e.,: For all e ≥ 0, there exists d ≥ 0 such that, for all α, β : N → {0, 1, 2} with α d = β d , it holds that [[t]](α) e = [[t]](β) e A function µ : N → N is a modulus of uniform continuity if: For all e ≥ 0, and all α, β : N → {0, 1, 2} with α µ(e) = β µ(e) , it holds that [[t]](α) e = [[t]](β) e Proposition. There exists a closed System T term Gt : N → N that defines a modulus of uniform continuity for t. An immediate consequence Suppose t : (N → N)k → N → N represents f : [−1, 1]k → [−1, 1] (using our System T encoding of signed binary). Then there exists Gt : N → N such that, for all e ≥ 0, and for all α1 , . . . , αk , β1 , . . . , βk : N → {0, 1, 2} with αi Gt (d) = βi Gt (d) , it holds that |f (real(α1 ), . . . , real(αk )) − f (real(β1 ), . . . , real(βk ))| ≤ 2−e Completing proof of theorem Suppose that t : (N → N)k → N → N is a closed term that T-represents f : [−1, 1]k → [−1, 1]. Let Gt be a uniform modulus for continuity for t on N → {0, 1, 2}, as given on the previous slide. Let gn : [−1, 1]k → [−1, 1] be defined by: apprk (Gt (n + 1)) (real ◦ t) : Ik → I Then for all x1 , . . . , xk ∈ [−1, 1], |f (x1 , . . . , xn ) − gn (x1 , . . . , xn )| ≤ 2−n Therefore the term below I-defines f . λ x0 . . . xk−1 . fastlim (λn. apprk (Gt (n + 1)) (real ◦ t) x0 . . . xk−1 ) Summary of what we have done so far We have considered two approaches to computation with real numbers in the interval [−1, 1]. 1. Intensional approach. Encode a real as a stream/function. Elaborated in the context of System T. 2. Extensional approach. Encapsulate real numbers in an abstract datatype. Elaborated in the context of Systems I and II. Main result: both approaches equivalent in power. Summary of what we have done so far We have considered two approaches to computation with real numbers in the interval [−1, 1]. 1. Intensional approach. Encode a real as a stream/function. Elaborated in the context of System T. 2. Extensional approach. Encapsulate real numbers in an abstract datatype. Elaborated in the context of Systems I and II. Main result: both approaches equivalent in power. Pressing question: What about the full real line? Part 3: A semi-extensional approach. Mantissa-exponent representation of real line We use System I (no need for II !). For convenience we include product types σ × τ . We use N to encode integers including negative integers. We write Z when doing this, and use evident related notation. Represent arbitrary real numbers using the type I × Z. (x, z) represents the real number 2z x. This is a semi-extensional representation of real numbers, combining: a discrete scaling z, which is intensional; and a continuous value x, which is extensional. (An alternative choice would be to use (x, z) to represent z + x. But this doesn’t work, not even if we use System II !) Equivalence between reals We reason in a setting in which we have intuitionistic first-order logic over System I extending the equational logic we have considered so far. (Actually would be better to reformulate what follows, as far as possible, to use only equational logic.) Equality between reals is the smallest equivalence relation on I × Z satisfying x , m+1 (x, m) ∼ 2 This is can be defined equationally by: (x, m) ∼ (y , n) ⇐⇒ x 2max(m,n)−m N.B., we need to assume the cancellation law x ⊕ y = x ⊕ z =⇒ y = z = y 2max(m,n)−n Basic arithmetic 0 = (0, 0) 1 = (1, 0) −(x, m) = (−x, m) x y (x, m) + (y , n) = ⊕ , max(m, n) + 1 2max(m,n)−m 2max(m,n)−n (x, m) × (y , n) = (x y , m + n) Can also define any rational number. So can define polynomials with rational coefficients. Basic equations The operations −, +, × respect ∼. The expected equations are all provable, e.g.: 0+x ∼ x x+y ∼ y+x (x + y) + z ∼ x + (y + z) 0×x ∼ 0 1×x ∼ 1 x×y ∼ y×x (x × y) × z ∼ x × (y × z) x × (y + z) ∼ (x × y) + (x × z) ... Basic definitions Define |(x, m)| ≤ 2n ⇐⇒ ∃y . (x, m) ∼ (y , n) (N.B., |x| ≤ 2n is defined as a primitive relation between x : I × Z and n : N. Is it possible to define | · | : I × Z → I × Z in System I?) Lemma. The following are equivalent 1. For all integers n, |(x, m)| ≤ 2n 2. (x, m) ∼ 0 3. x = 0 Define 0 ≤ (x, m) ⇐⇒ ∃y . x = 1 ⊕ y x ≤ y ⇐⇒ 0 ≤ y − x x < y ⇐⇒ ∃q ∈ Q. 0 < q and x + q ≤ y Limits of fast Cauchy sequences Suppose we have a sequence (xi )i given by: x(−) : N → I × Z such that the inequalities |xi+1 − xi | ≤ 2−(i+1) are witnessed by d(−) : N → I satisfying xi+1 − xi ∼ (di , −(i + 1)) Then we define lim xi = x0 + (M di , 0) i i Directions for investigation Question. Is every System T representable function : Rk → R representable in System I by a term (I × Z)k → I × Z ? Given that we can define rational polynomials and Cauchy limits, it is not implausible that a positive answer to the question might be obtained following a suitable constructive proof of the Stone-Weierstrass theorem. Question. If we add I as an abstract datatype to dependent type theory, does this support the development of constructive analysis, based on the semi-extensional real numbers defined via mantissa-exponent? Definability in System I revisited Given that System I now seems more interesting than Part 2 suggested, we end by returning to questions of definability and undefinability in System I. We address the following. I All system I functions we have defined so far have been smooth. Are all I-definable functions smooth? I How do we prove that dbl is not I-definable? And what does this tell us about I-definable functions more generally? Non-smooth I-definable functions times∗ (x, y ) = aff (−1) x y sq∗ (x) = times∗ (x, x) ∗ ∗ ∗ 7 , sq (−sq (−x)) g(x) = times 9 h(x) = M g3(i+1) (x) i H(x) = M (g3(i+1) (x))2 i N.B., g defines the function g(x) = 1 4 4 3 2 2 4 x − x − x + x 9 9 9 3 which satisfies: g(0) = 0 g0 (0) = 4 3 I h has derivative +∞ at 0. I H at 0 has derivative −∞ from below and +∞ from above. Graph of H (thanks to Yitong Li): Undefinability of dbl Proposition. The function dbl is not I-definable. We prove this using logical relations For every type τ we define a binary relation ∆τ ⊆ [[τ ]] × [[τ ]] by: ∆N (m, n) ⇐⇒ m = n ∆I (x, y ) ⇐⇒ if x ∈ {−1, 1} or y ∈ {−1, 1} then x = y ∆σ→τ (f , g ) ⇐⇒ ∀x, y ∈ [[σ]]. ∆σ (x, y ) implies ∆τ (f (x), g (y )) Lemma. For every System I constant c : τ , it holds that ∆τ ([[c]], [[c]]). E.g., to show ∆I→I→I (m, m). Suppose ∆I (x, x 0 ) and ∆I (y , y 0 ). (?) Need to show ∆I (x ⊕ y , x 0 ⊕ y 0 ). Suppose x ⊕ y ∈ {−1, 1}. W.l.o.g., assume it is 1. Then x = y = 1. So x 0 = 1 and y 0 = 1, by (?). Thus indeed x 0 ⊕ y 0 = 1 = x ⊕ y . And similarly for the case when x 0 ⊕ y 0 ∈ {−1, 1}. Lemma. For every System I constant c : τ , it holds that ∆τ ([[c]], [[c]]). E.g., to show ∆I→I→I→I (aff, aff). Suppose ∆I (x, x 0 ) and ∆I (y , y 0 ) and ∆I (z, z 0 ). y (1−z 0 ) x 0 +(1+z 0 ) y 0 Need to show ∆I (1−z) x+(1+z) , 2 2 Suppose (1−z) x+(1+z) y 2 (?) ∈ {−1, 1}. W.l.o.g., assume it is 1. There are three possibilities: (i) x = 1 and z = −1; (ii) y = z = 1; (iii) x = y = 1. In each case, the corresponding equations hold for x 0 , y 0 , z 0 , by (?). Thus indeed (1−z 0 ) x 0 +(1+z 0 ) y 0 2 =1= And similarly for the case when x 0 ⊕ (1−z) x+(1+z) y . 2 0 y ∈ {−1, 1}. Using the previous lemma, the fundamental lemma of logical relations gives us Lemma. For every closed System I term t : τ , ∆τ ([[t]], [[t]]). (The direct proof is by induction on term structure.) But ∆I→I (dbl, dbl) does not hold. Indeed we have ∆I 0, 12 but not ∆I dbl(0), dbl (Since dbl(0) = 0 and dbl 12 = 1 ) Therefore dbl is not I-definable. 1 2 Using the previous lemma, the fundamental lemma of logical relations gives us Lemma. For every closed System I term t : τ , ∆τ ([[t]], [[t]]). (The direct proof is by induction on term structure.) But ∆I→I (dbl, dbl) does not hold. Indeed we have ∆I 0, 12 but not ∆I dbl(0), dbl (Since dbl(0) = 0 and dbl 12 = 1 ) 1 2 Therefore dbl is not I-definable. More generally, for any I-definable f : [−1, 1] → [−1, 1], if f (x) ∈ {−1, 1} for some x ∈ (0, 1), then f is a constant function. Course summary 1. Gödel’s System T serves as a minimal functional programming language for implementing exact computation on real numbers via intensional representations. 2. Systems I and II implement an abstract datatype for a closed interval, offering an extensional approach to exact programming on the interval. System II is equivalent in power to the intensional approach. 3. A mantissa-exponent representation offers a semi-extensional approach to computing on the full real line. Apparently, the dbl function is not needed for this, so we can use System I.

Real numbers in type theory - Informatics Homepages Server

Related documents

Products

Support

Real numbers in type theory - Informatics Homepages Server

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib