Real numbers in type theory
Alex Simpson
Laboratory for Foundations of Computer Science
School of Informatics, University of Edinburgh
(Based on joint work with Martín Escardó, University of Birmingham)
3-lecture course
Days in Logic, Braga, 23–25 Jan 2014
Desiderata
1. Compute exactly with real numbers, avoiding rounding errors.
2. Program using implementation-independent primitives, so we
don’t need to know how real numbers are represented.
3. Reason about programs on real numbers; e.g., prove
equations.
We address these by extending type theory with an abstract
datatype of real numbers, with associated equational laws.
This approach also addresses another desideratum:
4. Give a foundationally neutral mathematical definition of real
number purely in terms of computationally meaningful
primitives.
Course structure
Part 1: The intensional approach.
Computing with representations of reals.
Part 2: The extensional approach.
An abstract datatype for the interval [−1, 1].
Part 3: A semi-extensional approach.
A mantissa-exponent representation of R.
Part 1: The intensional approach.
Representing reals exactly
Natural first attempt: represent a real number in [0, 1] exactly as a
lazy stream providing infinite binary expansion.
Let’s see how we might try to write a program computing the
average of two real numbers in [0, 1] using this representation.
0. 0 1 0 1 0 1 0 ...
0. 1 0 1 0 1 0 1 ...
0. ?
It is impossible to write such a program!
Technically, there is no continuous function
$\{0, 1\}^{\omega} \times \{0, 1\}^{\omega} \to \{0, 1\}^{\omega}$ that computes the average.
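The obstruction can be illustrated numerically. The sketch below is not from the slides, and the helper names `prefix_value` and `average_bounds` are hypothetical. It computes the range of averages compatible with the first n digits of the two expansions 0.0101... and 0.1010... shown above. An average below 1/2 forces the first output digit to be 0 and an average above 1/2 forces it to be 1, so as long as the range contains 1/2 in its interior no output digit can be committed.

```python
from fractions import Fraction

def prefix_value(digits):
    """Value of the finite binary expansion 0.d0 d1 d2 ... as an exact fraction."""
    return sum(Fraction(d, 2 ** (i + 1)) for i, d in enumerate(digits))

def average_bounds(n):
    """Bounds on the average of x = 0.0101... and y = 0.1010... when only the
    first n digits of each are known (each unseen tail adds between 0 and 2^-n)."""
    x = [i % 2 for i in range(n)]   # 0, 1, 0, 1, ...
    y = [1 - d for d in x]          # 1, 0, 1, 0, ...
    low = (prefix_value(x) + prefix_value(y)) / 2
    high = low + Fraction(1, 2 ** n)
    return low, high
```

For every n the interval returned by `average_bounds(n)` strictly contains 1/2, which is the finite-information version of the continuity argument.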
Three implementations of [−1, 1] that do work
Using streams:
$q_0 : q_1 : q_2 : q_3 : q_4 : q_5 : \ldots$
1. Fast Cauchy sequences
Require the $q_i$ to be rational numbers in [−1, 1] such that
$|q_j - q_i| \leq 2^{-i}$ whenever $j \geq i$. The stream represents the
real number $\lim_i q_i$.
2. Dyadic digits
Require the $q_i$ to be dyadic rational numbers in [−1, 1]. The
stream represents the real number $\sum_{i \geq 0} 2^{-(i+1)} q_i$.
3. Signed binary
This is the special case of dyadic digits in which we require
$q_i \in \{-1, 0, 1\}$.
These representations are interconvertible.
Simple recursive programs using dyadic digits
Average:
$$\mathrm{avg}(q : r,\; q' : r') = (q \oplus q') : \mathrm{avg}(r, r') \qquad \text{where } q \oplus q' = \frac{q + q'}{2}.$$
Multiplication:
$$\mathrm{digit\_mult}(q,\; q' : r') = (q q') : \mathrm{digit\_mult}(q, r')$$
$$\mathrm{mult}(q : r,\; q' : r') = \mathrm{avg}\bigl((q q') : \mathrm{mult}(r, r'),\; \mathrm{avg}(\mathrm{digit\_mult}(q, r'),\, \mathrm{digit\_mult}(q', r))\bigr)$$
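The two programs above can be tried out directly. The following is a minimal sketch, assuming a dyadic digit stream is modelled as a Python function from indices to `Fraction` digits (a choice made here for convenience; the slides use lazy streams). The helper names `tail` and `value` are hypothetical.

```python
from fractions import Fraction

def tail(s):
    """Drop the first digit of a stream given as a function i -> digit."""
    return lambda i: s(i + 1)

def avg(a, b):
    """Digitwise average: avg(q : r, q' : r') = (q (+) q') : avg(r, r')."""
    return lambda i: (Fraction(a(i)) + Fraction(b(i))) / 2

def mult(a, b):
    """mult(q : r, q' : r') = avg((q q') : mult(r, r'),
                                  avg(digit_mult(q, r'), digit_mult(q', r)))."""
    def digit(i):
        q, q2 = Fraction(a(0)), Fraction(b(0))
        # i-th digit of avg(digit_mult(q, r'), digit_mult(q', r))
        cross = (q * b(i + 1) + q2 * a(i + 1)) / 2
        # i-th digit of (q q') : mult(r, r')
        diag = q * q2 if i == 0 else mult(tail(a), tail(b))(i - 1)
        return (diag + cross) / 2
    return digit

def value(s, n):
    """Partial sum of the first n digits; within 2^-n of the represented real."""
    return sum(Fraction(s(i)) / 2 ** (i + 1) for i in range(n))
```

For example, with `third = lambda i: 1 if i % 2 == 0 else -1` (representing 1/3) and `half = lambda i: 1 if i == 0 else 0` (representing 1/2), `value(mult(third, half), 20)` approximates 1/6 to within 2^-20.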
Converting dyadic digits to signed binary
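$$\mathrm{dy\_add}(q,\; q' : r') = \begin{cases} 1 : \mathrm{dy\_add}(2q + q' - 1,\; r') & \text{if } 2q + q' \geq 1 \\ (2q + q') : r' & \text{if } -1 \leq 2q + q' \leq 1 \\ (-1) : \mathrm{dy\_add}(2q + q' + 1,\; r') & \text{if } 2q + q' \leq -1 \end{cases}$$
$$\mathrm{convert}(q_0 : q_1 : r) = \begin{cases} 1 : \mathrm{convert}(\mathrm{dy\_add}(q_0 - 1,\; q_1 : r)) & \text{if } q_0 + \frac{q_1}{2} \geq \frac{1}{2} \\ 0 : \mathrm{convert}(\mathrm{dy\_add}(q_0,\; q_1 : r)) & \text{if } -\frac{1}{2} \leq q_0 + \frac{q_1}{2} \leq \frac{1}{2} \\ (-1) : \mathrm{convert}(\mathrm{dy\_add}(q_0 + 1,\; q_1 : r)) & \text{if } q_0 + \frac{q_1}{2} \leq -\frac{1}{2} \end{cases}$$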
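A hedged sketch of the conversion using Python generators follows. It reads dy_add(q, s) as producing a stream for q plus the value of s, which is how the case analysis above behaves when that sum lies in [−1, 1]. The overlapping cases are resolved by testing the outer cases first, which is an arbitrary choice made here. Input streams are assumed to be infinite.

```python
from fractions import Fraction
from itertools import chain, cycle, islice

HALF = Fraction(1, 2)

def dy_add(q, s):
    """Stream for q + (value of s), assuming the result lies in [-1, 1]."""
    s = iter(s)
    q = Fraction(q)
    while True:
        t = 2 * q + Fraction(next(s))
        if t >= 1:
            yield Fraction(1)
            q = t - 1
        elif t <= -1:
            yield Fraction(-1)
            q = t + 1
        else:
            yield t
            yield from s   # the remaining digits pass through unchanged
            return

def convert(s):
    """Turn a stream of dyadic digits into signed binary digits in {-1, 0, 1}."""
    s = iter(s)
    while True:
        q0, q1 = Fraction(next(s)), Fraction(next(s))
        rest = chain([q1], s)      # put q1 back in front of the stream
        t = q0 + q1 / 2
        if t >= HALF:
            yield 1
            s = dy_add(q0 - 1, rest)
        elif t <= -HALF:
            yield -1
            s = dy_add(q0 + 1, rest)
        else:
            yield 0
            s = dy_add(q0, rest)
```

Each output digit wraps the remaining input in another generator, so this sketch is only suitable for extracting a modest number of digits, for example `list(islice(convert(cycle([Fraction(3, 4), Fraction(-1, 2)])), 20))`.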
Computing a useful infinite sum
Consider a dyadic digit sequence as given by a function from
natural numbers to dyadic rationals.
The program
$$M\bigl((q_0 : q_1 : r) : (q' : r') : s\bigr) = \frac{2 q_0 + q_1 + q'}{4} : M\bigl(\mathrm{avg}(r, r') : s\bigr)$$
of type
(dyadic digit stream) stream → dyadic digit stream
computes the function
$$(x_i)_{i \geq 0} \mapsto \sum_{i \geq 0} 2^{-(i+1)} x_i \;:\; [-1, 1]^{\omega} \to [-1, 1]$$
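The recursion for M can be unrolled into a loop. The sketch below assumes, as in the remark at the top of the slide, that streams are given as functions from natural numbers to digits, and that the input is a function `xs` from indices to such streams. The names `big_m` and `drop` are hypothetical.

```python
from fractions import Fraction

def drop(s, k):
    """The stream s with its first k digits removed."""
    return lambda i: s(i + k)

def avg(a, b):
    """Digitwise average of two dyadic digit streams."""
    return lambda i: (Fraction(a(i)) + Fraction(b(i))) / 2

def big_m(xs, n):
    """First n output digits of M applied to the streams xs(0), xs(1), ..."""
    out = []
    head = xs(0)
    for k in range(n):
        nxt = xs(k + 1)
        # output digit (2 q0 + q1 + q') / 4
        out.append((2 * Fraction(head(0)) + head(1) + nxt(0)) / 4)
        # continue with avg(r, r') in front of the remaining streams
        head = avg(drop(head, 2), drop(nxt, 1))
    return out
```

Applied to streams alternately representing 1 and −1, the output digits approximate the sum 1/2 − 1/4 + 1/8 − ... = 1/3.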
Two issues
- What programming capabilities do we need to write such
  programs?
  We shall see that such programs can be written in the
  minimal type-theoretic framework of typed λ-calculus with
  natural numbers (Gödel’s System T).
- Programming in the above style is done at a very
  nuts-and-bolts level. For instance, we need to take extreme care that
  programs behave equivalently on different intensional
  representations of the same real number; e.g.,
  f(0 : 0 : 0 : 0 : 0 : ...) and f((−1) : 1 : 1 : 1 : 1 : ...) should
  both result in representations of the same real number.
  A major goal of this course is to introduce an abstract
  datatype for real numbers, using which such extensional
  behaviour is guaranteed.
Gödel’s System T
Types:
σ ::= N | σ → τ
Terms:
- Typed variables: x : σ (where x is a variable of type σ)
- Abstractions: if t : τ then (λx : σ. t) : σ → τ
- Applications: if t : σ → τ and u : σ then t u : τ.
Constants:
0 : N
s : N→N
rec : σ → (σ → N → σ) → N → σ
Computation
Single-step reduction (applicable arbitrarily deep inside terms):
(λx. t) u −→ t[u/x]
rec u v 0 −→ u
rec u v (s(t)) −→ v (rec u v t) t
Theorem (Tait 1960s). Reduction in System T is strongly
normalizing.
Reduction is also confluent (Church-Rosser). It follows that every
closed term of type N reduces to a unique numeral (a term of the
form $s^k(0)$).
Strong normalization for System T implies (and is in fact
equivalent to) the 1-consistency of Peano Arithmetic.
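To get a feel for the recursor, here is a small Python model of the two reduction rules for rec, with natural numbers modelled as non-negative Python integers. This is an illustration only, not a System T implementation, and the example functions `pred` and `double` are chosen here and do not appear in the slides.

```python
def rec(u, v, n):
    """rec u v 0 = u;  rec u v (s t) = v (rec u v t) t."""
    acc = u
    for t in range(n):
        acc = v(acc, t)
    return acc

def pred(n):
    """Predecessor: 0 at zero, otherwise the previous number."""
    return rec(0, lambda _acc, t: t, n)

def double(n):
    """Doubling: apply the successor twice at each recursion step."""
    return rec(0, lambda acc, _t: acc + 2, n)
```

The loop always terminates, in line with strong normalization: the recursion is driven by the numeral n and nothing else.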
Set-theoretic semantics
Semantics of types:
[[N]] = N
[[σ → τ ]] = [[σ]] → [[τ ]]
Every closed term t : τ determines an element [[t]] ∈ [[τ]]. (The
semantics routinely generalises to any cartesian-closed category
(CCC) with natural numbers object (NNO).)
Say that a function $f : N^k \to N$ is T-representable if there exists a
closed term $t : N^k \to N$ such that, for all $n_1, \ldots, n_k \in N$,
$$t\, n_1 \ldots n_k \longrightarrow^{*} f(n_1, \ldots, n_k)$$
Theorem (essentially Gödel). A function is T-representable if and
only if it is provably total in Peano Arithmetic.
Equalities (valid in any CCC with NNO)
- If t −→ u then t = u.
- If t : σ → τ then t = λx : σ. t x.
- If w : σ → (σ → N → σ) → N → σ and u : σ and
  v : σ → N → σ and w u v 0 = u and
  w u v (s(x)) = v (w u v x) x then w u v = rec u v.
Exercises on System T.
1. Define the following functions of type N → N → N: addition,
multiplication, Ackermann’s function.
2. Prove commutativity and associativity for addition and
multiplication. Prove distributivity of multiplication over
addition.
Encoding real numbers in System T
We can encode any of our stream representations of [−1, 1] using
the type N → N.
E.g., in the case of signed binary, we can consider a function
α : N → N as encoding the signed binary stream
((α(0) mod 3) − 1) : ((α(1) mod 3) − 1) : ((α(2) mod 3) − 1) : . . .
The different representations are all intertranslatable in System T.
Moreover, all the programs we gave for computing functions on real
numbers using the signed binary and dyadic digits representation
are routinely translatable into System T terms of appropriate type.
Encoding functions on reals in System T
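For convenience we fix on signed binary representation. Consider
the function real : (N → N) → [−1, 1] defined by
$$\mathrm{real}(\alpha) = \sum_{i \geq 0} 2^{-(i+1)} \bigl((\alpha(i) \bmod 3) - 1\bigr)$$
We say that a function $f : [-1, 1]^k \to [-1, 1]$ is T-representable if
there exists a closed term $t : (N \to N)^k \to N \to N$ making the
following diagram commute:
$$\begin{array}{ccc} (N \to N)^k & \xrightarrow{\;[[t]]\;} & (N \to N) \\ \Big\downarrow{\scriptstyle \mathrm{real}^k} & & \Big\downarrow{\scriptstyle \mathrm{real}} \\ [-1, 1]^k & \xrightarrow{\;\;f\;\;} & [-1, 1] \end{array}$$
That is, $\mathrm{real}([[t]](\alpha_1, \ldots, \alpha_k)) = f(\mathrm{real}(\alpha_1), \ldots, \mathrm{real}(\alpha_k))$ for all $\alpha_1, \ldots, \alpha_k : N \to N$.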
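The decoding function real can be approximated by truncating the sum. The sketch below (the name `real_approx` and the three example encodings are illustrative) also shows two different encodings of the real number 0, corresponding to the streams 0 : 0 : 0 : ... and (−1) : 1 : 1 : 1 : ... mentioned earlier.

```python
from fractions import Fraction

def real_approx(alpha, n):
    """Partial sum of real(alpha) over the first n digits (error at most 2^-n)."""
    return sum(Fraction(alpha(i) % 3 - 1, 2 ** (i + 1)) for i in range(n))

zero_a = lambda i: 1                       # digits 0 : 0 : 0 : ...
zero_b = lambda i: 0 if i == 0 else 2      # digits (-1) : 1 : 1 : 1 : ...
third = lambda i: 2 if i % 2 == 0 else 0   # digits 1 : (-1) : 1 : (-1) : ...
```

Both `zero_a` and `zero_b` decode to approximations converging to 0, which is why a term t must be checked to respect such coincidences before it can be said to represent a function on reals.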
Comments
- All T-representable functions are continuous.
- Very many continuous functions are T-representable.
- But programming functions on reals in System T has to be
  done in the nuts-and-bolts style mentioned earlier.
We address the last point by introducing an abstract datatype for a closed
interval of real numbers.
Programs on real numbers can then be written in an automatically
extensional way using the interface for the abstract datatype.
Part 2: The extensional approach.
System I
We extend System T with a new type:
σ ::= N | σ → τ | I
I is an abstract datatype for the closed interval [−1, 1] of real
numbers. Semantics:
[[ I ]] = [−1, 1]
The interface for the type will roughly implement the idea that the
closed interval is determined as the free convex set on 2 generators
(−1, 1) with respect to affine maps.
Since the very formulation of convexity requires a pre-existing
interval, we replace convexity with the existence of iterated
midpoints and we replace affineness with preservation of (iterated)
midpoints. [Escardó & S., LICS 2001]
Interface and semantics
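New constants and their meanings
1 : I, with $[[1]] = 1$
−1 : I, with $[[-1]] = -1$
m : I → I → I, with $[[m]]\, x\, y = \dfrac{x + y}{2}$
M : (N → I) → I, with $[[M]]\, f = \displaystyle\sum_{i \geq 0} \frac{f(i)}{2^{i+1}}$
aff : I → I → I → I, with $[[\mathrm{aff}]]\, x\, y\, z = \dfrac{(1 - z)\, x + (1 + z)\, y}{2}$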
Computation
I is an abstract datatype. We don’t specify a particular concrete
implementation, but it is vital that an implementation is possible.
We have seen three possible implementations of the type I using
the type N → N in System T.
In System I, programming on real numbers is performed purely
using the interface, irrespective of implementation.
The program stays the same even if the implementation changes
(e.g., to improve efficiency).
Programs on reals are automatically extensional.
Some simple programming
x ⊕ y = m x y = M(x, y , y , y , y , y , y , . . . )
(So not actually necessary to include m as a primitive, since
derivable from M. Nonetheless included for convenience.)
0 = (−1) ⊕ 1
−x = aff 1 (−1) x
xy = aff (−x) x y
1/3 = M(1, −1, 1, −1, 1, −1, 1, −1, . . . )
More generally, any rational number is definable using M applied
to an eventually periodic sequence of 1s and −1s.
Still more generally, any real number with a T-definable binary
expansion is definable.
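These definitions can be sanity-checked against the intended semantics. The sketch below models the interface on exact rationals, with M truncated to finitely many terms. It is a model of the semantics only, not an implementation of the abstract datatype, and the names `big_m`, `neg` and `times` are chosen here.

```python
from fractions import Fraction

def m(x, y):
    """[[m]] x y = (x + y) / 2."""
    return (x + y) / 2

def aff(x, y, z):
    """[[aff]] x y z = ((1 - z) x + (1 + z) y) / 2."""
    return ((1 - z) * x + (1 + z) * y) / 2

def big_m(f, n=60):
    """Truncation of [[M]] f to its first n terms (error at most 2^-n)."""
    return sum(Fraction(f(i)) / 2 ** (i + 1) for i in range(n))

def neg(x):
    """-x = aff 1 (-1) x."""
    return aff(Fraction(1), Fraction(-1), x)

def times(x, y):
    """x y = aff (-x) x y."""
    return aff(neg(x), x, y)
```

On rational inputs `neg` and `times` agree exactly with negation and multiplication, and `big_m(lambda i: 1 if i % 2 == 0 else -1)` approximates 1/3.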
Equalities
m t t = t
m t u = m u t
m (m t u) (m v w) = m (m t v) (m u w)
$M_i\, t_i = m(t_0,\; M_i\, t_{i+1})$
$(\lambda i.\, t_i) = (\lambda i.\, m(u_i, t_{i+1})) \;\Longrightarrow\; t_0 = M_i\, u_i$
aff t u (−1) = t
aff t u 1 = u
aff t u m(v, w) = m(aff t u v, aff t u w)
For w : I → I → I, if w t u (−1) = t and w t u 1 = u and
w t u m(x, y) = m(w t u x, w t u y) then
w t u = aff t u
Consequences (exercise!)
−−x = x
x ⊕ −x = 0
x 0 = 0
xy = yx
(x y ) z = x (y z)
x (−y ) = −(x y )
x (y ⊕ z) = (x y ) ⊕ (x z)
More programming, medial power series
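Suppose
$$f(x) = \sum_{n \geq 0} a_n x^n$$
where $a_n \in [-1, 1]$. Then
$$\frac{1}{2} f\!\left(\frac{x}{2}\right) = \frac{1}{2} \sum_{n \geq 0} a_n \left(\frac{x}{2}\right)^{n} = \sum_{n \geq 0} a_n \frac{x^n}{2^{n+1}} = M_n\, a_n x^n$$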
n
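So, for example:
$$\frac{1}{2 - x} = M_n\, x^n$$
$$\frac{1}{2} \exp\!\left(\frac{x}{2}\right) = M_n\, \frac{x^n}{n!}$$
$$\frac{1}{2} \sin\!\left(\frac{x}{2}\right) = M_n\, \mathrm{parity}(n)\, (-1)^{\lfloor (n-1)/2 \rfloor}\, \frac{x^n}{n!}$$
$$\frac{1}{2} \ln\!\left(1 + \frac{x}{2}\right) = M_n\, a_n x^n \quad \text{with } a_0 = 0 \text{ and } a_n = \frac{(-1)^{n-1}}{n} \text{ for } n \geq 1$$
$$\frac{1}{2} \sqrt{1 + \frac{x}{2}} = M_n\, \frac{(-1)^n (2n)!}{(1 - 2n)(n!)^2\, 4^n}\, x^n$$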
All these define total continuous functions [−1, 1] → [−1, 1].
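A quick floating-point check of some of these identities, offered as a sketch: M is truncated to 60 terms, which is ample because the n-th term is bounded by 2^-(n+1) on [−1, 1]. The names `medial` and `sqrt_coeff` are hypothetical.

```python
import math

def medial(coeff, x, terms=60):
    """Truncation of M_n (a_n x^n) = sum_n a_n x^n / 2^(n+1)."""
    return sum(coeff(n) * x ** n / 2 ** (n + 1) for n in range(terms))

def sqrt_coeff(n):
    """Coefficient of y^n in the power series of sqrt(1 + y)."""
    return ((-1) ** n * math.factorial(2 * n)
            / ((1 - 2 * n) * math.factorial(n) ** 2 * 4 ** n))
```

For instance, `medial(lambda n: 1.0, x)` can be compared with `1 / (2 - x)`, and `medial(sqrt_coeff, x)` with `math.sqrt(1 + x / 2) / 2`, at sample points of [−1, 1].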
I-definable functions
A function $f : [-1, 1]^k \to [-1, 1]$ is said to be I-definable if there
exists a closed System I term $t : I^k \to I$ such that
$[[t]]\, x_1 \ldots x_k = f(x_1, \ldots, x_k)$
Any I-definable function is automatically continuous, indeed
computable.
Proposition. If f is I-definable then it is T-representable.
A non-definable function
$$\mathrm{dbl}(x) = \begin{cases} 1 & \text{if } \frac{1}{2} \leq x \\ 2x & \text{if } -\frac{1}{2} \leq x \leq \frac{1}{2} \\ -1 & \text{if } x \leq -\frac{1}{2} \end{cases}$$
Note that dbl is continuous though not smooth.
Also dbl is computable, in fact T-representable.
Proposition. The function dbl is not I-definable.
We will prove this later.
System II
Extend System I with a new constant
dbl : I → I
denoting the function dbl : [−1, 1] → [−1, 1]
New equations:
dbl m(1, m(1, x)) = 1
dbl m(0, x) = x
dbl m(−1, m(−1, x)) = −1
One consequence is cancellation:
m(t, u) = m(t, v) =⇒ u = v
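One possible derivation of cancellation, sketched here rather than taken from the slides, uses the medial law and the consequence x ⊕ (−x) = 0 from System I together with the equation dbl m(0, x) = x. Writing ⊕ for m:

$$\begin{aligned} (t \oplus u) \oplus ((-t) \oplus 0) &= (t \oplus (-t)) \oplus (u \oplus 0) && \text{(medial law)} \\ &= 0 \oplus (0 \oplus u) && (x \oplus (-x) = 0,\ \text{commutativity}) \end{aligned}$$

Applying dbl twice and using dbl(0 ⊕ x) = x each time gives

$$u = \mathrm{dbl}\bigl(\mathrm{dbl}\bigl((t \oplus u) \oplus ((-t) \oplus 0)\bigr)\bigr),$$

and the same computation with v in place of u gives the corresponding expression for v. If t ⊕ u = t ⊕ v, the two right-hand sides coincide, so u = v.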
Programming in System II
[x + y] = dbl(x ⊕ y)
x ⊖ y = x ⊕ (−y)
[x − y] = dbl(x ⊖ y)
max(0, x) = [[x − 1] + 1]
max(x, y) = dbl(x/2 + max(0, y ⊖ x))
min(x, y) = −max(−x, −y)
|x| = max(−x, x)
Question. Are any of the last 4 functions definable in System I?
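The definitions above can be checked against the intended meanings on exact rationals. In this sketch dbl is modelled semantically as doubling followed by truncation to [−1, 1], and all function names are chosen for illustration. It models the semantics only and says nothing about definability.

```python
from fractions import Fraction

def dbl(x):
    """Doubling truncated to [-1, 1]."""
    return max(Fraction(-1), min(Fraction(1), 2 * x))

def m(x, y):        # x (+) y
    return (x + y) / 2

def tadd(x, y):     # [x + y] = dbl(x (+) y)
    return dbl(m(x, y))

def tsub(x, y):     # [x - y] = dbl(x (-) y), where x (-) y = x (+) (-y)
    return dbl(m(x, -y))

def max0(x):        # max(0, x) = [[x - 1] + 1]
    return tadd(tsub(x, Fraction(1)), Fraction(1))

def max_(x, y):     # max(x, y) = dbl(x/2 + max(0, y (-) x))
    return dbl(x / 2 + max0(m(y, -x)))

def min_(x, y):
    return -max_(-x, -y)

def abs_(x):
    return max_(-x, x)
```

On a grid of rationals in [−1, 1], `max_`, `min_` and `abs_` agree with Python's built-in `max`, `min` and `abs`.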
A limit-finding function in System II
Define fastlim : (N → I) → I by:
$$\mathrm{fastlim}_i\, x_i = \mathrm{dbl}\Bigl(M_n\, \mathrm{dbl}^{\,n+1}(x_{n+1} \ominus x_n)\Bigr)$$
Lemma. If $|x_{n+1} - x_n| \leq 2^{-n}$ then $\mathrm{fastlim}_i\, x_i$ is the limit of the
(fast) Cauchy sequence $(x_n)_n$.
Note that $\mathrm{fastlim}_i\, x_i$ always returns a value, even if $(x_n)_n$ is
non-convergent.
Also, if $(x_n)_n$ converges slowly, $\mathrm{fastlim}_i\, x_i$ need not be the limit.
II-definable functions
A function $f : [-1, 1]^k \to [-1, 1]$ is said to be II-definable if there
exists a closed System II term $t : I^k \to I$ such that
$[[t]]\, x_1 \ldots x_k = f(x_1, \ldots, x_k)$
Proposition. If f is II-definable then it is T-representable.
Theorem. If f is T-representable then it is II-definable.
Outline of proof
1. Use the T-representation t of f to construct a Cauchy
sequence of polygon approximations to f in System II.
2. Extract a modulus of uniform continuity for t on the compact
subset $(N \to \{0, 1, 2\})^k$.
3. Use the modulus of uniform continuity to obtain a fast
Cauchy sequence of polygon approximations.
4. A term defining f is then obtained using fastlim.
A gluing function
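Define glue : $(I \to I)^2 \to I \to I$ by
$$\mathrm{glue}\, f\, g\, x = \mathrm{dbl}\Bigl(\mathrm{dbl}\Bigl(\bigl(f(\mathrm{dbl}(x + \tfrac{1}{2})) \oplus g(\mathrm{dbl}(x - \tfrac{1}{2}))\bigr) \ominus \tfrac{1}{2} f(1)\Bigr)\Bigr)$$
Then glue represents the function:
$$\mathrm{glue}\, f\, g\, x = \bigl[\, f(\mathrm{dbl}(x + \tfrac{1}{2})) + g(\mathrm{dbl}(x - \tfrac{1}{2})) - f(1) \,\bigr]$$
which, whenever f(1) = g(−1), satisfies
$$\mathrm{glue}\, f\, g\, x = \begin{cases} f(2x + 1) & \text{if } -1 \leq x \leq 0 \\ g(2x - 1) & \text{if } 0 \leq x \leq 1 \end{cases}$$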
Polygon approximations
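For every k ≥ 1, we define a System II term:
$$\mathrm{appr}_k : N \to ((N \to N)^k \to I) \to (I^k \to I)$$
The base case $\mathrm{appr}_1$ is defined by recursion on N to satisfy:
$$\mathrm{appr}_1\, 0\, h = \mathrm{aff}\, (h(-1))\, (h(1))$$
$$\mathrm{appr}_1\, (n + 1)\, h = \mathrm{glue}\, \bigl(\mathrm{appr}_1\, n\, (\lambda x.\, h(\mathrm{avg}(x, -1)))\bigr)\, \bigl(\mathrm{appr}_1\, n\, (\lambda x.\, h(\mathrm{avg}(x, 1)))\bigr)$$
Given $\mathrm{appr}_k$, the term $\mathrm{appr}_{k+1}$ is given by
$$\mathrm{appr}_{k+1}\, n\, h\, x_0\, x_1 \ldots x_k = \mathrm{appr}_1\, n\, (\lambda y_0.\, \mathrm{appr}_k\, n\, (h\, y_0)\, x_1 \ldots x_k)\, x_0$$
The application $\mathrm{appr}_k\, n\, h$ produces a piecewise multilinear
approximation to the function h, with the argument types changed
from N → N to I.
More precisely, the $\mathrm{appr}_k\, n$ function uses k-tuples of values from
the set
$$Q_n := \{\, q_n^i \mid 0 \leq i \leq 2^n \,\} \quad \text{where} \quad q_n^i := \frac{i}{2^{n-1}} - 1$$
to form a lattice of $(2^n + 1)^k$ rational partition points in $[-1, 1]^k$.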
The application apprk n h then results in a function Ik → I that
agrees with h at the partition points, and is (separately) affine in
each coordinate between partition points. It is also affine in the h
argument.
The lemma on the next slide formalises this.
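The one-dimensional case can be explored with a simplified model. In the sketch below h is taken to be a function on rationals in [−1, 1] rather than on representations of type N → N, avg(x, ±1) is computed arithmetically, and dbl and the truncated sum are modelled by clamping. These are simplifying assumptions made here, so the code illustrates the recursion rather than the System II terms themselves.

```python
from fractions import Fraction

def clamp(x):
    """Truncate to the interval [-1, 1]."""
    return max(Fraction(-1), min(Fraction(1), x))

def aff(x, y, z):
    """Affine interpolation: x at z = -1, y at z = 1."""
    return ((1 - z) * x + (1 + z) * y) / 2

def glue(f, g):
    """x -> [f(dbl(x + 1/2)) + g(dbl(x - 1/2)) - f(1)], modelled by clamping."""
    return lambda x: clamp(f(clamp(2 * x + 1)) + g(clamp(2 * x - 1)) - f(Fraction(1)))

def appr1(n, h):
    """Piecewise affine approximation of h with 2^n + 1 partition points."""
    if n == 0:
        return lambda x: aff(h(Fraction(-1)), h(Fraction(1)), x)
    left = appr1(n - 1, lambda x: h((x - 1) / 2))    # h(avg(x, -1))
    right = appr1(n - 1, lambda x: h((x + 1) / 2))   # h(avg(x, 1))
    return glue(left, right)
```

With `h = lambda x: x * x`, the function `appr1(3, h)` agrees with h at the nine points i/4 − 1 and is affine between neighbouring points. Evaluation cost grows exponentially in n, which is acceptable for small illustrations.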
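Lemma. If $h : (N \to N)^k \to I$ represents $f : [-1, 1]^k \to [-1, 1]$
then:
1. For all $r_0, \ldots, r_{k-1} \in Q_n$ we have:
$$\mathrm{appr}_k\, n\, h\, r_0 \ldots r_{k-1} = f\, r_0 \ldots r_{k-1}$$
2. For $0 \leq j < k$, $0 \leq i < 2^n$, and $0 \leq \lambda \leq 1$:
$$\begin{aligned} \mathrm{appr}_k\, n\, h\, x_0 \ldots x_{j-1} \left(\tfrac{i + \lambda}{2^{n-1}} - 1\right) x_{j+1} \ldots x_{k-1} = {} & (1 - \lambda)\; \mathrm{appr}_k\, n\, h\, x_0 \ldots x_{j-1}\, q_n^{i}\, x_{j+1} \ldots x_{k-1} \\ & + \lambda\; \mathrm{appr}_k\, n\, h\, x_0 \ldots x_{j-1}\, q_n^{i+1}\, x_{j+1} \ldots x_{k-1} \end{aligned}$$
Note that $\frac{i + \lambda}{2^{n-1}} - 1 = (1 - \lambda)\, q_n^{i} + \lambda\, q_n^{i+1}$.
Also, if $h_1, h_2 : (N \to N)^k \to I$ are real-respecting then
3. For $0 \leq \lambda \leq 1$, we have:
$$\mathrm{appr}_k\, n\, ((1 - \lambda) h_1 + \lambda h_2)\, x_0 \ldots x_{k-1} = (1 - \lambda)\; \mathrm{appr}_k\, n\, h_1\, x_0 \ldots x_{k-1} + \lambda\; \mathrm{appr}_k\, n\, h_2\, x_0 \ldots x_{k-1}$$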
Continuity in System T
Given a closed System T term
t : (N → N) → N → N
it holds that [[t]] : (N → N) → (N → N) is continuous with respect
to the product topology on N → N (Baire space).
In detail, this means that for any α : N → N, we have:
For all e ≥ 0, there exists d ≥ 0 such that, for all
β : N → N with β d = α d , it holds that
[[t]](α) e = [[t]](β) e
(We write α n for the n-tuple (α(0), . . . , α(n − 1)) ∈ Nn .)
Uniform continuity in System T
Since N → {0, 1, 2} ⊆ N → N is compact, it follows that [[t]] is
uniformly continuous on this subspace; i.e.,:
For all e ≥ 0, there exists d ≥ 0 such that, for all
α, β : N → {0, 1, 2} with α d = β d , it holds that
[[t]](α) e = [[t]](β) e
A function µ : N → N is a modulus of uniform continuity if:
For all e ≥ 0, and all α, β : N → {0, 1, 2} with
α µ(e) = β µ(e) , it holds that [[t]](α) e = [[t]](β) e
Proposition. There exists a closed System T term Gt : N → N
that defines a modulus of uniform continuity for t.
An immediate consequence
Suppose $t : (N \to N)^k \to N \to N$ represents $f : [-1, 1]^k \to [-1, 1]$
(using our System T encoding of signed binary).
Then there exists $G_t : N \to N$ such that, for all e ≥ 0, and for all
$\alpha_1, \ldots, \alpha_k, \beta_1, \ldots, \beta_k : N \to \{0, 1, 2\}$ such that each $\alpha_i$ and $\beta_i$
agree on their first $G_t(e)$ values, it holds that
$$|f(\mathrm{real}(\alpha_1), \ldots, \mathrm{real}(\alpha_k)) - f(\mathrm{real}(\beta_1), \ldots, \mathrm{real}(\beta_k))| \leq 2^{-e}$$
Completing proof of theorem
Suppose that $t : (N \to N)^k \to N \to N$ is a closed term that
T-represents $f : [-1, 1]^k \to [-1, 1]$.
Let $G_t$ be a modulus of uniform continuity for t on N → {0, 1, 2},
as given on the previous slide.
Let $g_n : [-1, 1]^k \to [-1, 1]$ be defined by:
$$\mathrm{appr}_k\, (G_t(n + 1))\, (\mathrm{real} \circ t) : I^k \to I$$
Then for all $x_1, \ldots, x_k \in [-1, 1]$,
$$|f(x_1, \ldots, x_k) - g_n(x_1, \ldots, x_k)| \leq 2^{-n}$$
Therefore the term below II-defines f.
$$\lambda x_0 \ldots x_{k-1}.\; \mathrm{fastlim}\, (\lambda n.\; \mathrm{appr}_k\, (G_t(n + 1))\, (\mathrm{real} \circ t)\, x_0 \ldots x_{k-1})$$
Summary of what we have done so far
We have considered two approaches to computation with real
numbers in the interval [−1, 1].
1. Intensional approach.
Encode a real as a stream/function.
Elaborated in the context of System T.
2. Extensional approach.
Encapsulate real numbers in an abstract datatype.
Elaborated in the context of Systems I and II.
Main result: both approaches equivalent in power.
Pressing question: What about the full real line?
Part 3: A semi-extensional approach.
Mantissa-exponent representation of real line
We use System I (no need for II !).
For convenience we include product types σ × τ .
We use N to encode integers including negative integers. We write
Z when doing this, and use evident related notation.
Represent arbitrary real numbers using the type I × Z.
(x, z) represents the real number $2^z x$.
This is a semi-extensional representation of real numbers,
combining: a discrete scaling z, which is intensional; and a
continuous value x, which is extensional.
(An alternative choice would be to use (x, z) to represent z + x.
But this doesn’t work, not even if we use System II !)
Equivalence between reals
We reason in a setting in which we have intuitionistic first-order
logic over System I extending the equational logic we have
considered so far.
(Actually would be better to reformulate what follows, as far as
possible, to use only equational logic.)
Equality between reals is the smallest equivalence relation on I × Z
satisfying
$$(x, m) \sim \left(\frac{x}{2},\; m + 1\right)$$
This can be defined equationally by:
$$(x, m) \sim (y, n) \iff \frac{x}{2^{\max(m,n) - m}} = \frac{y}{2^{\max(m,n) - n}}$$
N.B., we need to assume the cancellation law
x ⊕ y = x ⊕ z =⇒ y = z
Basic arithmetic
0 = (0, 0)
1 = (1, 0)
−(x, m) = (−x, m)
$$(x, m) + (y, n) = \left( \frac{x}{2^{\max(m,n) - m}} \oplus \frac{y}{2^{\max(m,n) - n}},\;\; \max(m, n) + 1 \right)$$
(x, m) × (y , n) = (x y , m + n)
Can also define any rational number.
So can define polynomials with rational coefficients.
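The arithmetic on mantissa-exponent pairs can be modelled with exact rationals. In this sketch a pair (x, z) is a Python tuple of a `Fraction` in [−1, 1] and an `int`, division by a power of two is done arithmetically (in System I it would be iterated averaging with 0), and the helper names `value`, `scale` and `equiv` are chosen here.

```python
from fractions import Fraction

def value(r):
    """The real number represented by (x, z), namely 2^z * x."""
    x, z = r
    return x * Fraction(2) ** z

def scale(x, k):
    """x / 2^k for k >= 0."""
    return x / 2 ** k

def add(r, s):
    """(x, m) + (y, n): align both mantissas, average them, raise the exponent by one."""
    (x, m), (y, n) = r, s
    top = max(m, n)
    return ((scale(x, top - m) + scale(y, top - n)) / 2, top + 1)

def mul(r, s):
    """(x, m) * (y, n) = (x y, m + n)."""
    (x, m), (y, n) = r, s
    return (x * y, m + n)

def equiv(r, s):
    """(x, m) ~ (y, n) iff the mantissas agree after aligning exponents."""
    (x, m), (y, n) = r, s
    top = max(m, n)
    return scale(x, top - m) == scale(y, top - n)
```

The extra +1 in the exponent of a sum is what keeps the mantissa inside [−1, 1]: the average of two aligned mantissas never leaves the interval.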
Basic equations
The operations −, +, × respect ∼.
The expected equations are all provable, e.g.:
0+x ∼ x
x+y ∼ y+x
(x + y) + z ∼ x + (y + z)
0×x ∼ 0
1×x ∼ x
x×y ∼ y×x
(x × y) × z ∼ x × (y × z)
x × (y + z) ∼ (x × y) + (x × z)
...
Basic definitions
Define
$$|(x, m)| \leq 2^n \iff \exists y.\; (x, m) \sim (y, n)$$
(N.B., $|\mathbf{x}| \leq 2^n$ is defined as a primitive relation between $\mathbf{x}$ : I × Z
and n : N. Is it possible to define | · | : I × Z → I × Z in System I?)
Lemma. The following are equivalent
1. For all integers n, $|(x, m)| \leq 2^n$
2. (x, m) ∼ 0
3. x = 0
Define
0 ≤ (x, m) ⇐⇒ ∃y . x = 1 ⊕ y
x ≤ y ⇐⇒ 0 ≤ y − x
x < y ⇐⇒ ∃q ∈ Q. 0 < q and x + q ≤ y
Limits of fast Cauchy sequences
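Suppose we have a sequence $(\mathbf{x}_i)_i$ given by:
$$\mathbf{x}_{(-)} : N \to I \times Z$$
such that the inequalities
$$|\mathbf{x}_{i+1} - \mathbf{x}_i| \leq 2^{-(i+1)}$$
are witnessed by $d_{(-)} : N \to I$ satisfying
$$\mathbf{x}_{i+1} - \mathbf{x}_i \sim (d_i,\; -(i + 1))$$
Then we define
$$\lim_i \mathbf{x}_i = \mathbf{x}_0 + (M_i\, d_i,\; 0)$$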
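To see why this definition gives the limit, note that the pair $(d_i, -(i+1))$ represents the real number $2^{-(i+1)} d_i$. Identifying each $\mathbf{x}_i$ with the real number it represents, a short telescoping calculation (a worked check added here, not part of the slides) gives:

$$M_i\, d_i = \sum_{i \geq 0} 2^{-(i+1)} d_i = \sum_{i \geq 0} (\mathbf{x}_{i+1} - \mathbf{x}_i) = \lim_{n \to \infty} (\mathbf{x}_n - \mathbf{x}_0) = \Bigl(\lim_{n \to \infty} \mathbf{x}_n\Bigr) - \mathbf{x}_0$$

so adding $\mathbf{x}_0$ recovers the limit of the sequence.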
Directions for investigation
Question. Is every System T representable function $R^k \to R$
representable in System I by a term $(I \times Z)^k \to I \times Z$?
Given that we can define rational polynomials and Cauchy limits, it
is not implausible that a positive answer to the question might be
obtained following a suitable constructive proof of the
Stone-Weierstrass theorem.
Question. If we add I as an abstract datatype to dependent type
theory, does this support the development of constructive analysis,
based on the semi-extensional real numbers defined via
mantissa-exponent?
Definability in System I revisited
Given that System I now seems more interesting than Part 2
suggested, we end by returning to questions of definability and
undefinability in System I.
We address the following.
- All System I functions we have defined so far have been
  smooth.
  Are all I-definable functions smooth?
- How do we prove that dbl is not I-definable?
  And what does this tell us about I-definable functions more
  generally?
Non-smooth I-definable functions
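$$\mathrm{times}^{*}(x, y) = \mathrm{aff}\, (-1)\, x\, y$$
$$\mathrm{sq}^{*}(x) = \mathrm{times}^{*}(x, x)$$
$$g(x) = \mathrm{times}^{*}\!\left(\tfrac{7}{9},\; \mathrm{sq}^{*}(-\mathrm{sq}^{*}(-x))\right)$$
$$h(x) = M_i\, g^{3(i+1)}(x)$$
$$H(x) = M_i\, \bigl(g^{3(i+1)}(x)\bigr)^2$$
N.B., g defines the function
$$g(x) = \frac{1}{9} x^4 - \frac{4}{9} x^3 - \frac{2}{9} x^2 + \frac{4}{3} x$$
which satisfies:
$$g(0) = 0 \qquad g'(0) = \frac{4}{3}$$
- h has derivative +∞ at 0.
- H at 0 has derivative −∞ from below and +∞ from above.
[Figure: graph of H (thanks to Yitong Li)]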
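The closed form for g can be confirmed by evaluating the definition on exact rationals. The sketch below uses the semantics of aff given earlier; the name `g_poly` for the polynomial is introduced here for the comparison.

```python
from fractions import Fraction

def aff(x, y, z):
    """[[aff]] x y z = ((1 - z) x + (1 + z) y) / 2."""
    return ((1 - z) * x + (1 + z) * y) / 2

def times_star(x, y):
    """times*(x, y) = aff (-1) x y."""
    return aff(Fraction(-1), x, y)

def sq_star(x):
    """sq*(x) = times*(x, x)."""
    return times_star(x, x)

def g(x):
    """g(x) = times*(7/9, sq*(-sq*(-x)))."""
    return times_star(Fraction(7, 9), sq_star(-sq_star(-x)))

def g_poly(x):
    """The polynomial x^4/9 - 4x^3/9 - 2x^2/9 + 4x/3."""
    return x ** 4 / 9 - 4 * x ** 3 / 9 - 2 * x ** 2 / 9 + 4 * x / 3
```

Since both sides are polynomials of degree 4, agreement at more than four points already shows they are equal.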
Undefinability of dbl
Proposition. The function dbl is not I-definable.
We prove this using logical relations
For every type τ we define a binary relation
∆τ ⊆ [[τ ]] × [[τ ]]
by:
∆N (m, n) ⇐⇒ m = n
∆I (x, y ) ⇐⇒ if x ∈ {−1, 1} or y ∈ {−1, 1} then x = y
∆σ→τ (f , g ) ⇐⇒ ∀x, y ∈ [[σ]]. ∆σ (x, y ) implies ∆τ (f (x), g (y ))
Lemma. For every System I constant c : τ , it holds that
∆τ ([[c]], [[c]]).
E.g., to show $\Delta_{I \to I \to I}(m, m)$.
Suppose $\Delta_I(x, x')$ and $\Delta_I(y, y')$. (⋆)
Need to show $\Delta_I(x \oplus y,\; x' \oplus y')$.
Suppose $x \oplus y \in \{-1, 1\}$. W.l.o.g., assume it is 1.
Then x = y = 1. So x′ = 1 and y′ = 1, by (⋆).
Thus indeed $x' \oplus y' = 1 = x \oplus y$. And similarly for the case when
$x' \oplus y' \in \{-1, 1\}$.
Lemma. For every System I constant c : τ, it holds that
$\Delta_\tau([[c]], [[c]])$.
E.g., to show $\Delta_{I \to I \to I \to I}(\mathrm{aff}, \mathrm{aff})$.
Suppose $\Delta_I(x, x')$ and $\Delta_I(y, y')$ and $\Delta_I(z, z')$. (⋆)
Need to show
$$\Delta_I\!\left( \frac{(1 - z)\, x + (1 + z)\, y}{2},\;\; \frac{(1 - z')\, x' + (1 + z')\, y'}{2} \right)$$
Suppose $\frac{(1 - z)\, x + (1 + z)\, y}{2} \in \{-1, 1\}$. W.l.o.g., assume it is 1.
There are three possibilities: (i) x = 1 and z = −1; (ii) y = z = 1;
(iii) x = y = 1.
In each case, the corresponding equations hold for x′, y′, z′, by (⋆).
Thus indeed
$$\frac{(1 - z')\, x' + (1 + z')\, y'}{2} = 1 = \frac{(1 - z)\, x + (1 + z)\, y}{2}.$$
And similarly for the case when $\frac{(1 - z')\, x' + (1 + z')\, y'}{2} \in \{-1, 1\}$.
Using the previous lemma, the fundamental lemma of logical
relations gives us
Lemma. For every closed System I term t : τ , ∆τ ([[t]], [[t]]).
(The direct proof is by induction on term structure.)
But ∆I→I (dbl, dbl) does not hold.
Indeed we have $\Delta_I(0, \tfrac{1}{2})$ but not $\Delta_I(\mathrm{dbl}(0), \mathrm{dbl}(\tfrac{1}{2}))$
(since dbl(0) = 0 and $\mathrm{dbl}(\tfrac{1}{2}) = 1$).
Therefore dbl is not I-definable.
More generally, for any I-definable f : [−1, 1] → [−1, 1], if
f (x) ∈ {−1, 1} for some x ∈ (0, 1), then f is a constant function.
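The relation at type I is easy to experiment with. The sketch below checks, on a small grid of rationals only (so it is a sanity check and not a proof), that m preserves the relation, and then exhibits the pair at which dbl fails. The names `delta_i` and `m_respects_delta` are chosen here.

```python
from fractions import Fraction
from itertools import product

ENDS = (Fraction(-1), Fraction(1))

def delta_i(x, y):
    """Delta_I(x, y): if either argument is an endpoint, the two must coincide."""
    return x == y if (x in ENDS or y in ENDS) else True

def m(x, y):
    """Midpoint (x + y) / 2."""
    return (x + y) / 2

def dbl(x):
    """Doubling truncated to [-1, 1]."""
    return max(Fraction(-1), min(Fraction(1), 2 * x))

GRID = [Fraction(k, 4) for k in range(-4, 5)]

def m_respects_delta():
    """Check Delta_I(m x y, m x' y') for all related grid inputs."""
    return all(delta_i(m(x, y), m(x2, y2))
               for x, x2, y, y2 in product(GRID, repeat=4)
               if delta_i(x, x2) and delta_i(y, y2))
```

The inputs 0 and 1/2 are related, since neither is an endpoint, but their images under dbl are 0 and 1, which are not.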
Course summary
1. Gödel’s System T serves as a minimal functional programming
language for implementing exact computation on real
numbers via intensional representations.
2. Systems I and II implement an abstract datatype for a closed
interval, offering an extensional approach to exact
programming on the interval. System II is equivalent in power
to the intensional approach.
3. A mantissa-exponent representation offers a semi-extensional
approach to computing on the full real line. Apparently, the
dbl function is not needed for this, so we can use System I.