>> Leonardo de Moura: It's a great pleasure to introduce Cody Roux. He is a post doc
at CMU. He works in type theory and theorem proving. Today he is going to talk about
the structural properties of pure type systems.
>> Cody Roux: Okay. Well, I want to thank Microsoft and everybody for having me here.
It's really a pleasure to be here. I'm from the Department of Philosophy, so I'm going to
talk about philosophy. Hopefully there will be some intersection with computer science
and people will find some relevance to what's interesting to them.
The title is a little bit mysterious so I'm going to explain it of course, and really what I
want to do is ask a philosophical question about the notion of abstraction. It's actually a
series of questions and those questions I show can be answered in the framework of
pure type systems. And I'm going to explain what these questions are, what pure type
systems are and how I can use pure type systems to answer them. And to answer
these questions I need to examine something that I call structural properties of pure type
systems which explains the title of my talk to a certain extent, but I haven't said what
structural properties are. And then, basically at the end I'm going to give the main results
and try and justify the fact that they do answer these philosophical questions about the
notion of abstraction.
Okay, so in my original talk I had mathematicians and programmers because I was
talking to mathematicians. But programmers and mathematicians: basically the most
crucial part of their work is recognizing patterns and abstracting on them. Okay, this is
really what we do the most is try to understand these patterns and say, "Okay, this is an
important thing and I need to separate it and make it modular." This is a trivial example
but if you have 1 plus X plus 1 plus Y plus 1 you can say, okay, well here's a pattern; 1
appears plenty of times. And I can sort of express this pattern by introducing a lambda
here and saying every occurrence of one can be replaced by this abstract variable Z.
And then, I can apply this to 1 and it has the same meaning as above.
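To make this concrete, here is a minimal sketch in Haskell (my own rendering, not from the talk), where the repeated subterm 1 becomes the lambda-bound variable z:

    -- The expression with the repeated subterm written out three times.
    direct :: Int -> Int -> Int
    direct x y = 1 + x + 1 + y + 1

    -- The same expression after abstracting: every occurrence of 1 is
    -- replaced by the bound variable z, and we apply the lambda to 1.
    abstracted :: Int -> Int -> Int
    abstracted x y = (\z -> z + x + z + y + z) 1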
And you can imagine if 1 is somehow a very complex computation then you can say, oh,
well by abstracting here I've saved time because I only need to do this computation
once. The abstraction allows me to sort of reunite all of these individual computations
into one computation. But abstraction is also useful for mental processes. And the way
we do abstraction is in general this kind of scientific method where we have these
concrete observations that 2 plus 4 happens to be equal to 4 plus 2. And from a number
of these concrete instances, we create abstract instances. And this is a very important
step: to go from concrete observations to abstract observations. Once we've done this
step basically we have this second step where we create a universal observation that
says that this concrete observation has been turned into an abstract statement which is
universally true. And then, mathematics tries to prove things universally but based on
these concrete observations.
Okay, so I want to understand what allows us to do this. What is the process involved in
having concrete instances, turning them into abstract instances and then allowing
yourself to form a universal statement about this? What is actually happening when I do
this? In particular when are we allowed to make an abstraction? And, when are we
allowed to make a universal quantification?
An interesting question I haven't really explained sufficiently yet is what do we get as a
result? What is the result of a universal quantification? What kind of category is it in,
what is the nature of the universal quantification? Is the
proposition 2 plus 4 equals 4 plus 2 of the same nature as for all X,Y: X plus Y equals Y
plus X. I mean an obvious answer is yes; they're both propositions. But there are
refinements to this answer where we could say, oh, well really observing that 2 plus 4
equals 4 plus 2 is not the same as having this universal statement.
Okay, so there is something called the Curry-Howard correspondence. If you've heard of
it, that's great; and if you haven't, that's not a big deal. What the Curry-Howard
correspondence expresses is that there is some kind of relationship between universal
quantification and function spaces, okay, typically computable function spaces. This is a
really nice observation because it says basically building functions by lambda
abstraction is really the same thing as performing the introduction for universal
quantification. So we have these logical operations that correspond to programming
operations. Our question about universal quantification and what was the nature of the
universal quantification can be rephrased using this correspondence as a question about
function spaces. What does it mean to build a function space and what is the resulting
nature of that space?
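As a tiny Haskell illustration of the correspondence (my own example): under the propositions-as-types reading, implication is the function arrow, and a proof is a program of that type.

    -- Modus ponens: from A implies B and A, conclude B.
    -- The "proof" is just function application.
    modusPonens :: (a -> b) -> a -> b
    modusPonens f x = f x

    -- Transitivity of implication: the proof term is composition.
    implTrans :: (a -> b) -> (b -> c) -> (a -> c)
    implTrans f g = g . f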
Okay, now to ask my second question about abstraction and quantification I just need a
little background. The simply-typed lambda calculus is a very basic programming
language. In fact it's so basic that it's sort of a theoretical basis of programming
languages; it's very simple. So you have base types -- I'm going to use the pointer. So
you have these base types that express sort of atomic kinds of types you have in your
language. And then, you have functions on these types. And in particular you have
higher-order functions. Here I have a function that takes as input this F, which is itself a
function. Okay? So simply-typed lambda calculus basically has just these types and
higher-order functions and that's all that the simply-typed lambda calculus is made out
of.
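For instance, in Haskell notation (an illustrative sketch, with Int standing in for a single base type):

    -- A function on the base type.
    succInt :: Int -> Int
    succInt n = n + 1

    -- A higher-order function: its argument f is itself a function.
    twice :: (Int -> Int) -> Int -> Int
    twice f x = f (f x)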
And something that's kind of surprising if you don't know it is that in the simply-typed
lambda calculus every program is terminating. So you can run every program and you'll
get a result. And this is sort of an important observation, because it gives you a large
number of consequences. In particular it says
something like, for example, that you can decide equality between two terms of the
lambda calculus. We'll see other consequences of termination.
Okay, so we have the simply-typed lambda calculus and it's very nice. And it has these
nice properties, but it's not a real programming language for many reasons. One of these
reasons is that there is no polymorphism. This is one of the basic requirements that you
want to have today; you don't want to write the same function twice. And in particular this
function lambda X, X that just returns its argument, you should be able to apply it to the
number three to get three. And you should also be able to apply it to the Boolean true
and get true. You don't want to write two functions: one which takes integers, for
example, and one that takes Booleans with exactly the same code. So you want to add
this feature to the language.
You have the simply-typed lambda calculus, how do you add this feature of
polymorphism? And what's interesting is that there are two possible answers to this
question. There are two ways to add polymorphism, and they are fundamentally
different. They both start the same way: you add variables at the type level. So you have
these simple types and now you have type variables, types that can be instantiated with
other types. And then you add quantification where you say, "Well, we have a function
and it's of type for all X, X arrow X. For any potential type it has the type X arrow X."
Okay, now the difference between these two approaches to polymorphism is what
does the "for all" here quantify over? What are the possible instances of X? What am I
allowed to replace X by?
The first answer says I'm only allowed to replace X with simple types so no quantifiers in
the things I replace X with, and the second answer is any type including types that have
quantifications themselves. These two choices lead to dramatically different
programming languages. In the first case, we have polymorphism. We can do this
polymorphic identity but it's conservative in the sense that there isn't a fundamental way
to add new functions, functions that really do new kinds of computations. In the second
case you have this incredibly powerful system called System F which is a lot harder to
analyze but that has many, many new functions. And in particular, termination for
System F still holds, but it's a lot harder to prove. In the first case we've added polymorphism
but it's safe in the sense that there are no new functions. In the second case it's
somewhat unsafe because you get all these crazy new functions. Okay? And this
difference comes completely from what instantiations you allow for the universal
quantifications. This is a very important point on what it means to have polymorphism.
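Haskell happens to exhibit both flavors, so here is a rough sketch of the contrast (my own example, assuming GHC's RankNTypes extension):

    {-# LANGUAGE RankNTypes #-}

    -- Prenex polymorphism: the forall sits at the outside. In the first,
    -- conservative answer it may only be instantiated with quantifier-free types.
    identity :: forall a. a -> a
    identity x = x

    -- The System F direction: the forall appears to the left of an arrow,
    -- so the argument itself must be a polymorphic function.
    useBoth :: (forall a. a -> a) -> (Int, Bool)
    useBoth f = (f 3, f True)

With only the first, conservative kind of polymorphism, a type like that of useBoth cannot even be written down.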
More generally you want to know what kinds of quantification will lead to conservative
extensions, and a conservative extension in my sense is something where you have
more expressive types but you don't have new programs. You can't write programs that
behave in a fundamentally different way. And this is often desirable because you want
types to be more expressive but you don't want to be able to write programs that do silly
things or wrong things or, you know, don't terminate for example.
Okay, I'm going to talk briefly about dependent types because it's a notion that
underlies everything that follows. A dependent type is a type which contains term-level
information. And I mean I hope this example is going to be sufficient to justify sort of the
idea behind dependent types. Here I have a list with three elements. I would like to have
a type that can express the fact that the list has three elements, and to express that fact
it needs to contain a term. So the type Vec 3 contains the term 3, which expresses that the list 1,
2, 3 has 3 elements. Okay? And we can quantify over these type level expressions as
well. In fact you'd want a function reverse to work for any length vector so for all N it
takes a vector with N elements and it returns a vector with the same number of
elements. Okay? So you're going to be able to quantify over that. And it's the same question: you
want to add these quantifications and you want to know whether adding these quantifications is
safe, whether you have no new functions, or whether you've created something fundamentally more
powerful in the sense that you have different functions.
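Here is a rough approximation of this Vec idea in Haskell, using GADTs and DataKinds (the names and the choice of functions are mine; the reverse example needs more machinery, so vmap and vhead stand in):

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    -- Type-level natural numbers, promoted to the kind level by DataKinds.
    data Nat = Z | S Nat

    -- A vector whose type records its length, like the talk's Vec N.
    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- The type promises that mapping preserves the length.
    vmap :: (a -> b) -> Vec n a -> Vec n b
    vmap _ VNil         = VNil
    vmap f (VCons x xs) = VCons (f x) (vmap f xs)

    -- Head is only defined on vectors whose type proves them non-empty.
    vhead :: Vec ('S n) a -> a
    vhead (VCons x _) = x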
And to answer this question in a formal manner, I'm going to present an existing
framework that's called pure type systems that really allows us to express in a very fine-grained way what the dependencies are, what it means when you quantify over a
variable in a type. Are there any questions at this point? Okay, so a pure type system is
a framework of many different type systems, and it's a generic framework that allows us
to express typed programming languages. But using the Curry-Howard isomorphism, you
can also say it's a framework for expressing logics. Okay? So logics and programming
languages are sort of two sides of the same coin.
And one characteristic of pure type systems, what makes it a nice sort of playground for
understanding type systems is that it only allows universal quantification. It doesn't have
existential quantification. It doesn't have conjunction. It doesn't have data types. It just
has universal quantification, and universal quantification is a logical notion. The
corresponding programming language notion is that of dependent function space. Okay,
so a few facts about pure type systems that sort of additionally justify the fact that we're
interested in them: one, they're very expressive. You can actually find a pure type
system that is so expressive that it allows you to express all of set theory, so basically all
of mathematics can be expressed using a particular pure type system. Some pure type
systems are just incredibly powerful specification languages. They're relatively well
studied. They were invented in the 80's.
And here I cite Barendregt but he wasn't the only inventor. But, this book is basically the
Bible where he explains all the basic properties of pure type systems. And they've been
studied relatively extensively since the 80's so they're a well-understood framework. And
they're quite flexible; you can use them as the core of a functional language. You can
say, okay, I take a pure type system and I'm going to use it as the basis of a real-world
language like Haskell or a specification language like Agda or Coq. To do that I'm going
to need to add some features, but I'm going to start with a pure type system and then
add features to it. So studying pure type
systems gives us a starting point for studying these more complex programming
languages like Haskell and Coq.
However, the theory of pure type systems is something that can be really quite complex.
In particular they were invented in the 80's but there were several open questions which
is something that's quite unusual in type theory. I mean, open questions in type theory
are kind of a surprising thing. There are many open questions in complexity theory, but in
type systems there aren't that many famous open questions. I've named a couple here, but
it's not very important what they are, but there are these two open questions concerning
pure type systems.
Okay, so our questions about adding polymorphism and understanding quantifications:
can they be answered using pure type systems? This is a rhetorical question because
I'm arguing that they can. But to give my argument I need to explain what a pure type
system is and a pure type system is completely described by these three things. The first
one is a set of sorts. Okay? Just any arbitrary set that we call the set of sorts, S. The
second one is a binary relation between elements of S that we call the axioms, A. And the
third one is a ternary relation on elements of S that I call the rules, R. And that's it; that's all you need to
describe a pure type system. Now of course I haven't talked about type systems yet, but
once you have this data you have everything you need to understand how the type
system is described.
I'm going to explain informally what these pieces of data mean. Each element of S
represents a category of objects, a type of objects. Type is such a loaded word that I'm
kind of reluctant to use it. But every element of S represents informally objects that are
alike in some manner. This is kind of vague so I'm going to give examples. For example,
star -- this star here -- as an element of S is often the symbol used to represent the
category of propositions. So every proposition is in star, okay, is in the category star.
And box is often used to represent the category of all types, or of all sets. Okay? And
iota is traditionally used to represent the category of natural numbers.
So iota represents all the natural numbers. So every element of S represents, like this,
a category of objects. And then, if (S1, S2) is in A, which means that there is an axiom
S1 : S2, this informally means that S1 is a member of the category S2. Remember that
each element of S represents a collection, a category of like objects. Well, sometimes
other sorts can be members of that category. Okay? So if S1, S2 is in A then S1 is a
member of the category S2. The third one is a little bit more complex but it specifies in
which way we can quantify over parameterized elements of a category. So if I have an
element of S2 that depends on a parameter of S1, I can universally quantify over that
parameter and it gives me a result in S3. Okay? I'm going to give examples of this later
but basically this says that if A is an element of the sort S1 and, for each element X of A,
B of X is an element of the sort S2, then you can quantify over all these X's and you'll end up in
S3.
Okay? So we write pi instead of for all, but that's just notation.
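That triple of data can be written down very directly. A minimal Haskell encoding (mine, not the talk's notation) might be:

    -- A pure type system is specified by exactly this data: sorts S,
    -- axioms A (pairs (s1, s2), read "s1 is of type s2"), and rules R
    -- (triples (s1, s2, s3) saying which pi-types may be formed and
    -- which sort the resulting pi-type lands in).
    data PTS s = PTS
      { sorts  :: [s]
      , axioms :: [(s, s)]
      , rules  :: [(s, s, s)]
      }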
>>: In your last slide, [inaudible] with S3, the last colon: where does it bind? It says the
entire for all expression...
>> Cody Roux: Yes.
>>: ...is a member of...?
>> Cody Roux: Yes. I've had this question before. I should reduce the space.
>>: [Inaudible]
>> Cody Roux: Okay. This is one object, which is the universally quantified statement,
basically, where for every X in A, B of X holds, or we have an element of B of X. And all
this is in the category S3, the sort S3.
Yeah, sorry about that. My kerning is poor. Okay, so I write pi instead of for all. That's
just a tradition. Okay, so given a PTS P I'm going to introduce formally what the type
system associated to P is. I said P was sorts, axioms and rules, and now I'm going to
say what the type system associated to those sorts, axioms and rules is. And that's
really what we're going to study. We don't care about the set, the axioms and the rules;
we care about this type system that's determined by those things.
So the first one just says that if S1, S2 is an axiom then we can derive the judgment S1
is of type S2. This is unsurprising. That's exactly what I said S1, S2 and A meant. And
the second one is this more complicated statement about rules which says that if A is of
type S1 and if B is of type S2 under this assumption that X is of type A then pi X A,B is a
type which is in the sort S3 if these three types are in our rule. Okay?
So these two rules sort of tell us how to build these basic types which are the pi's and
the sorts. And once we have that we can build terms, and there are only three ways to
build a term: either it's a variable in the context, with this first rule Var, or it's an
abstraction that builds an element of the pi type, or it's an application where I have an
element of the pi type and a term of the domain, and I apply the
function to the term. So this is a way of constructing pi's and of destructing them. And
you can see that this looks like the universal quantification rule.
If I have for all X in A, B; well then, in particular B of U holds, okay, if U is of type A.
This is really like universal quantification. And similarly, if generically for an
arbitrary X in A, B holds, then for all X in A, B holds. But it's also like a function type:
if given an X in A I can build an element of B then I can build a function that takes an
element in A and returns an element in B. And this is the same. If you ignore this, if you
have a function from A to B and an element of A, you can just apply it and get an
element of B. So that's all there is to it. This is the complete definition of pure type
systems, except for this conversion rule, which says that if two types are equal in some sense then
any element of the first type is also an element of the second type. And equal in some
sense means equal with respect to this computational rule, which is just the ordinary beta
reduction rule, where a lambda applied to an argument is equal to the lambda's body
with the variable replaced by the argument.
We have four things really.
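The term language shared by every PTS is equally small; a sketch in the same hypothetical Haskell encoding (constructor names are mine):

    -- Sorts, variables, pi-types, abstractions and applications: nothing else.
    data Term s
      = Sort s                         -- a sort, e.g. star or box
      | Var  String                    -- a variable
      | Pi   String (Term s) (Term s)  -- pi x : A. B, dependent function space
      | Lam  String (Term s) (Term s)  -- lambda x : A. b
      | App  (Term s) (Term s)         -- application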
>>: Can I ask a question?
>> Cody Roux: Yes.
>>: How come you have a kind of a typed beta reduction, why do you require A prime to
be sorted as opposed to getting it as a property of beta reduction, that it preserves
sorts?
>> Cody Roux: Beta reduction preserves sorts but beta expansion doesn't. So this is
kind of a sanity check that says I haven't expanded things and introduced some well
typed creature that would then disappear when I beta reduced. There are some versions
of pure type systems that don't have this requirement, but in general this is mostly a
sanity check. We say...
>>: Even beta expansion can be designed to be well typed, right?
>> Cody Roux: You could limit that expansion to only well typed beta expansions, but
that significantly complicates the theory of pure type systems. [Laughing] I mean one of
the open conjectures about pure type systems is that this untyped conversion here
and the typed conversion lead to the same underlying systems. It's completely non-obvious. And it's quite surprising, because it seems obviously true that
if you limit these conversions to well typed conversions, no harm should come. But it's
actually very hard to prove. So I'd say don't concentrate too much on this. I've added it
for convenience but there are technical reasons for why it's there. Does that answer your
question? We can come back to it.
And this is basically all there is. The only rules I've omitted are structural rules that I call
the boring rules. But this is all there is to it: just sorts and pi types, abstraction,
application, variables of course and then this conversion rule. And that's all there is to it.
That's a very simple type system, but it allows us to model a vast array of different
programming languages, and the least it can do is the simply typed lambda calculus.
So one way to model the simply typed lambda calculus using this framework is to
introduce these two sorts, iota and star. Iota is going to represent a base type, say, of
natural numbers.
And star is going to represent the sort of all types. And then, the type of natural numbers
of course is a type, so it's in the sort of all types, so we add this axiom. And now this rule
says that if, given an element of a type, I can form an element of a type, then I can build
the abstraction that takes an element of the first type and returns an element of the
second. And this still lives in star. Okay? This is basically the rule that allows us to
form arrow types like this. So I can form the identity function on iota by
abstracting over X and returning X. This is of type iota arrow iota, where iota
arrow iota is just pi X of type iota, iota. And the fact that we can form this pi type uses
this rule. So we have the simply typed lambda calculus with this rule, because we can
form these pi types using this rule, and these pi types correspond to just function types.
All right?
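Using the PTS record sketched earlier, the talk's encoding of the simply typed lambda calculus is tiny (the string names for the sorts are my own choice):

    stlc :: PTS String
    stlc = PTS
      { sorts  = ["iota", "*"]
      , axioms = [("iota", "*")]    -- iota : *
      , rules  = [("*", "*", "*")]  -- ordinary arrow types between types
      }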
Now I'm not going to go into too much detail but I do want to stress the fact that using
this framework of pure type systems you can build these very, very rich systems. So I
just showed that we can build the simply typed lambda calculus. You can also build the
simply typed lambda calculus where instead of having a single base type you can
declare these abstract base types in the context. And this is just presented in a slightly
different way where the sort of all types itself has a type. And that's the only
difference. And in this second version basically you can declare abstract types that you
call A, B, C, and all these are type variables, but you can't quantify over them.
The next system is often called star colon star, or type colon type. It's the same as the
simply typed lambda calculus except that star is of type star. The type of all types is a type.
And it seems relatively innocent; I mean it's the same as the previous one but it just has
this different axiom. System F which allows general type quantification, it looks like the
simply typed lambda calculus again except it has this rule which allows us to quantify
over general types. So we can form the type of the polymorphic identity, where we say for every
type X, X arrow X is a type. And we use this rule to form that quantification. And then, the
calculus of constructions additionally adds these two forms of quantification. And these
correspond intuitively to type constructors like list which is a constructor which takes a
type and returns a type and dependent types which are types that can depend on
values. The values are here, and there's the type. And you say, well, you're allowed to have types
that depend on values using this rule. Okay? So all these different rules allow different
kinds of function spaces to be built. This is the ordinary function space of functional
programming languages. This is polymorphism. This is type constructors, and this is
dependent types. So we have this nice little picture where every rule corresponds to a
kind of type construction we want to allow.
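In the same hypothetical encoding, the systems just mentioned come out as small variations on the specification of the simply typed lambda calculus:

    -- System F: the extra rule (box, *, *) allows quantifying over all types.
    systemF :: PTS String
    systemF = PTS
      { sorts  = ["*", "box"]
      , axioms = [("*", "box")]
      , rules  = [("*", "*", "*"), ("box", "*", "*")]
      }

    -- The calculus of constructions further adds type constructors
    -- (box, box, box) and dependent types (*, box, box).
    coc :: PTS String
    coc = systemF
      { rules = rules systemF ++ [("box", "box", "box"), ("*", "box", "box")] }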
And then, this U minus is mostly important for historical reasons. I just wanted to show it
because it's very similar to system F but it also allows polymorphism at the kind level.
This is kind polymorphic. And then, CC omega is basically the core of Coq. And it's a little
bit more complicated because it has this infinite set of sorts. It has star and box i for any
natural number i. And it has a bunch of axioms where you say star is of type box i for
any i, and box i is of type box j for any i and j such that i is smaller than j. Okay? And it
has these complicated rules which are a generalization of the rules for the calculus of
constructions.
It has this infinite hierarchy of sorts, and this forms the basis of the calculus of
constructions as used in Coq. This mainly serves to show that we can express all these
really powerful, really interesting type systems just by a very simple set of sorts,
axioms and rules. Now normalization in these systems is a very...
>>: [Inaudible] If you had box sub zero, is that the same as star?
>> Cody Roux: I'm sorry. What line?
>>: The last line. What if -- So you have a box sub i.
>> Cody Roux: Right here?
>>: Yeah, point to it. It's left. For example. Or any one of those boxes, right?
>> Cody Roux: Yes.
>>: My question is, is box zero the same thing as star? Or are they different?
>> Cody Roux: No. Box zero is just right above star. Yeah, it's star then box zero then
box 1 and box 2. That's sort of the mental picture you need.
>>: I think you can explain with the box i, star, star rule. This rule shows that star has a
special status, right?
>> Cody Roux: Yes. Yes. Yeah, okay, yes. If your point was, is this star redundant now
that we have all these infinite boxes, the answer is actually no because of this rule. And
thank you, Leo, for pointing that out.
>>: I guess also if you had box of zero equal to star would that mean that -- Oh, it's
strictly less than. Okay. I was wondering if that led to inconsistency, as with star colon star.
>> Cody Roux: Well, I haven't said that star colon star is inconsistent yet. But, yeah, it
would. You'd basically contain this system. But, yes, star has a special status. You have
box i, star, star, whereas here K has to be bigger than the maximum. Okay? So if star was
box minus one, say, this rule would be violated. Okay, what does it mean for a pure type
system to be normalizing? It just says that if a term is well typed in the type system
corresponding to the pure type system then it has a beta-normal form. That's what it
means to be normalizing. And normalization is quite nice because first of all it ensures
decidability of type checking, which is the least you could ask of a type system: that you
can decide whether a term has a type or not. And normalization gives you that
guarantee. It allows you to compare terms which is necessary to be able to perform type
checking.
And the second thing it implies is if you view the system as a logic, if you view this pi
quantification as a universal for all and you think about these types as propositions then
normalization implies consistency of this logic. It implies that obviously false types are
uninhabited or that not all types are inhabited which is in general what consistency of a
logic means. Not all propositions are provable. Normalization gives you that property
because you can look at normal forms and you can prove that some types cannot
possibly have an inhabitant in normal form. So normalization is just a really important
property that you'd like to be able to guarantee for a few type systems.
The thing is it's really hard to predict. I took all the type systems we had previously with
these axioms and rules. And the simply typed lambda calculus is normalizing, of course,
in both versions. But then you have this sort of similar system where this is the only
difference and all of a sudden it's not normalizing any more. Now in retrospect you could
say, well, it's obvious because a type is of type itself and so there is this kind of
circularity. But I can assure you that it's not obvious to find a counter example that
actually is non-normalizing. So this came as kind of a surprise.
System F, which doesn't seem much simpler than star colon star, is normalizing, and so is the
calculus of constructions, which looks really complicated. This U minus is just a little bit
more complicated than System F: you only allow this extra polymorphism over kinds.
This is not normalizing. Okay? This also kind of came as a surprise. And this crazy system
that has this infinite tower of sorts, it is normalizing. So it seems like there's this sort of
random jump where you have normalizing sometimes and not normalizing sometimes,
and it's very difficult to predict which one it's going to be. One thing that was unclear last
time I gave this talk: it's unknown whether this problem is decidable or not. Nobody
knows if there's an algorithm where you put in a type system and it outputs yes if it's
normalizing and no if it's not. This is just a very hard problem. So how do we attack this
problem?
The mathematician in me says, well when you have a hard question, you have to decide
to not answer it. Okay? You say, okay, this question is too hard. I'm going to ask a
different question. And hopefully this different question is going to shed some light on the
original hard question.
So the different question I ask is: given normalizing PTS's, what are the operations that
preserve normalization? What can I do to a PTS, how can I modify a PTS, in such a way
that the resulting PTS is still normalizing if the original ones are? And this is what I
call the study of the structural theory of pure type systems. I want to examine the set of
all pure type systems and understand the structure of this large set. I want to understand
the interaction of how you construct new pure type systems in ways that preserve
normalization.
Here's a PTS that I call MinLog because it's minimal logic, minimal implicative logic. So
it's a very, very simple logic where I just have a sort of propositions and then I have
this sort of worlds which is just a sort containing the sort of all propositions. And I have
this rule that says you can build new propositions as implications. This rule allows me to
build implications. Now you may notice that this is just exactly the simply typed lambda
calculus from earlier, but it's seen as a logic. So I renamed the sorts so that it was more
apparent because this is a logic, okay, which is just minimal implicative logic. So it's a
very simple logic and in particular since it normalizes it's a consistent logic. You know
that this logic is consistent but it can only express implications, which is not that
fascinating. I mean first-year logic students can easily understand this.
We want to examine terms now, so we build this new PTS which contains this sort which
is the sort of all sets. I'm going to look at sets of terms and I need a sort to classify these
sets of terms. Now I have this other sort that's going to allow me to build term
constructors, functions that build new terms from terms. Okay? So set is of type Univ
which allows me to declare set variables, say suppose we have a set A.
And these two rules which seem more complicated than the previous ones are actually
simpler. I say that if I have a function from sets to sets, it lives in the new sort Fun, okay
-- So once I have a function, the function space is not a set anymore. So I can build the
functions but I can't keep iterating this which means I only have sort of first-order terms. I
can build functions -- Okay, this allows me to build functions with several arguments. But
these two rules only allow me to build first-order functions, functions that take a number
of arguments in a set and that return an element of another set. So this is a really nice
PTS in that it only allows me to build first-order terms. Okay?
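In the same hypothetical encoding, my best reading of these two specifications from the transcript is:

    -- MinLog: minimal implicative logic, i.e. the simply typed lambda
    -- calculus read as a logic.
    minLog :: PTS String
    minLog = PTS
      { sorts  = ["prop", "worlds"]
      , axioms = [("prop", "worlds")]
      , rules  = [("prop", "prop", "prop")]  -- implications between propositions
      }

    -- Term: first-order terms. Function spaces between sets land in fun,
    -- and fun cannot be iterated into higher-order function spaces.
    termPTS :: PTS String
    termPTS = PTS
      { sorts  = ["set", "univ", "fun"]
      , axioms = [("set", "univ")]
      , rules  = [("set", "set", "fun"), ("set", "fun", "fun")]
      }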
I have a PTS with simple propositions. I have a term language with first-order terms.
Now what do I want to do? I want to build a new PTS that allows quantifying over these
terms using the simple logic MinLog. And to do that I build a new PTS which just takes
the sorts of MinLog and Term, puts them together and adds a new sort, Multiverse which
is unrelated to all the other sorts. And I keep the same axioms and rules so I still have
my terms on one side and my propositions on the other side, but now I'm allowed to
make propositions depend on sets. Propositions that depend on sets still are
propositions. And this allows me to form universal statements about terms. Okay?
And these two rules -- wait, what is worlds again? Ah, yes. These two rules allow me
to quantify over all propositions in a proposition, so I can say for all P, P implies P, for
example. But the resulting proposition
is of a new nature. It's this conservative extension I talked about earlier. I want to be able
to quantify over all propositions but then I want this to be a higher level proposition, a
new proposition.
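Putting the pieces together, a sketch of this FOL system as described (using the set, prop, prop rule in the form corrected later in the question session):

    folPTS :: PTS String
    folPTS = PTS
      { sorts  = sorts minLog ++ sorts termPTS ++ ["multiverse"]
      , axioms = axioms minLog ++ axioms termPTS
      , rules  = rules minLog ++ rules termPTS
              ++ [ ("set", "prop", "prop")           -- propositions over terms
                 , ("worlds", "prop", "multiverse")  -- quantify over propositions
                 , ("worlds", "multiverse", "multiverse") ]  -- and repeatedly
      }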
>>: [Inaudible] worlds from the [inaudible]?
>> Cody Roux: Yes. It's the type of prop.
>>: Okay, I see.
>> Cody Roux: So if I quantify over all propositions then the sort of that type is worlds.
Okay? So now these two rules allow me to quantify over all propositions. And this rule
allows me to quantify over terms to get a proposition that talks about all terms in a set.
>>: And the last one is just giving you like a [inaudible]?
>> Cody Roux: Yes. This allows me to quantify once over all propositions and this
allows me to quantify again over all propositions. So usually these two rules come
together.
>>: And sorry one more question. Why did you -- I mean if you didn't know anything
about what these sorts actually meant semantically, you could try other combinations.
Like prop, set, prop, which may not make much sense, but is just syntactically allowed.
>> Cody Roux: Yes.
>>: Did you design these particular constructions because you were aiming for first-order logic?
>> Cody Roux: Yes.
>>: Okay.
>> Cody Roux: Yes. But I'm going to show in the next slide that there's a theorem that
allows me to say things about these special rules. So it's a theorem that takes these
rules and says, "Oh, well they satisfy a certain criterion." And criterion they satisfy in
particular implies that my new PTS is normalizing if and only if the old PTS's were
normalizing. Okay? So these rules have a shape that is simple in some sense. They don't
add anything to the pure type system.
>>: So the first rule I see is having some interaction between the two PTS's, right?
>> Cody Roux: Yes.
>>: The next two rules are all compositions of sorts from just the first PTS alone.
>> Cody Roux: Yes, that's true. But in some sense these rules allow more expressivity
about propositions, but they don't allow us to prove more propositions. It's a conservative
extension. So all the propositions in the original logic that were unprovable are still
unprovable. I can't prove any new propositions.
>>: So could I have started with the simply typed lambda calculus and forgot about the
terms and just taken your second and third rule?
>> Cody Roux: Yes. That would've given me polymorphism.
>>: Okay. That would still have been -- You're going to tell us about a theorem that
would ensure that that system is still [inaudible]?
>> Cody Roux: Yes. So this answers the question I asked at the beginning of this slide
which is how do we add polymorphism in a safe manner? And this is the answer. I
mean, these two rules add polymorphism in a safe manner. And since they satisfy one of
my theorems, you're guaranteed just syntactically that you can't break normalization.
Yeah, so in more detail I'm explaining that this rule -- set, set, prop -- allows building
propositions that depend on terms.
So if I have a set variable A and I have a proposition that depends on A, I can form this
because of my set, set, prop rule. I'm sorry, no. Yes, I can form this because of my set,
set, prop rule and I can form this universally quantified statement which says that for
every X in A, P of X implies [inaudible]. And I can even prove it, but that's not the
important part. The important part is being able to express statements that depend on
terms. All right? So my set, set, prop allows for these kinds of propositions. And the
worlds, prop, multiverse allows us to additionally quantify over P, for example. So I can
say for any P, which is a predicate on the set A, and for any X in A, P of X implies P of
X. And this is of type multiverse, okay; this is not a proposition. It's something that lives
in a higher world.
>>: Did you have set, set, prop? You had set, set, fun and then set, prop.
>> Cody Roux: Oh, yes. That's a typo. I'm sorry.
>>: Okay. So set, prop, prop.
>> Cody Roux: Yes.
>>: Okay.
>> Cody Roux: Darn. I missed that one the first time around. Yeah, yeah. Set, prop,
prop. So you see this is a set and this is a proposition, and the result we get is a
proposition. Okay, so the theorem is if MinLog and Term are normalizing then this new
FOL PTS I've just constructed is normalizing. Okay? And in fact we can additionally
show the FOL is a conservative extension which means that there are no propositions
which were unprovable in MinLog and that are now provable in FOL. Okay? We can
express new things and prove them but certainly the old things haven't changed. There
are no new provable propositions from MinLog. All right?
And in general -- So I've basically said what I wanted to say, but I am going to show how
we formally express this theorem. And the formal expression of the theorem basically
says I can take two PTS's, P and Q, and I can take their disjoint union, okay, where I just
take the sorts and I suppose that they have empty intersection. I just put them together
and I get a new PTS. And the first theorem is: if you're typable in this new PTS then
you're typable in one of the old PTS's. But this theorem is a little bit subtle to express because
this context could be mixed. It could be a mixed context that takes some types from P
and some types from Q.
You have to sort of filter out the unwanted contexts. So this proposition is actually
non-trivial. And this theorem, that if you're typable in the disjoint sum then you're typable
in one of the two, implies normalization of the disjoint sum.
Okay? So forming disjoint sums is safe. But I've done two things: I've formed the disjoint
sum and then I've added these extra rules. And these extra rules I've added are all of the
form (S, K, K), like the set, prop, prop rule I added, where S is a sort of one of the PTS's
and K is a sort of the other. The intuition is that these rules, if we add them for every
sort in P and every sort in Q, create the pure type system which is the Q logic of P
terms.
P was the PTS that allows us to form terms and sets. And Q in MinLog was the PTS that
expressed minimal logic. And so by adding these rules we've formed a logic which talks
about P terms and has propositions from the Q logic. And the second thing we did -- I'm
sorry. So the theorem is that this new PTS with these extra rules is normalizing if and
only if both PTS's are normalizing, so P and Q are normalizing.
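A sketch of the disjoint sum construction in the same encoding (my own helper; tagging sorts with Either is one way to guarantee the empty intersection):

    disjointUnion :: PTS s -> PTS t -> PTS (Either s t)
    disjointUnion p q = PTS
      { sorts  = map Left (sorts p) ++ map Right (sorts q)
      , axioms = [ (Left a,  Left b)  | (a, b) <- axioms p ]
              ++ [ (Right a, Right b) | (a, b) <- axioms q ]
      , rules  = [ (Left a,  Left b,  Left c)  | (a, b, c) <- rules p ]
              ++ [ (Right a, Right b, Right c) | (a, b, c) <- rules q ]
      }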
And the second thing we did was allow quantification over all propositions. And to do
that we added this new special sort that I called multiverse that here I call S triangle K.
And then, I added these two rules, okay, which say I can quantify over S for any K but I
have to bump up the result. I have to end up in this new sort. And this allows me to keep
quantifying over S, to quantify several times over S. And I call the resulting PTS P hat.
And the intuition is that you can quantify over S-parameterized K's using these rules.
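In the same encoding, one possible reading of this second construction (the fresh sort is the talk's S triangle K, the multiverse in the example; this is my interpretation of the transcript, not a definitive statement of the theorem):

    -- Add a fresh sort above the system, plus the two bump-up rules that
    -- let you quantify over s (at kind k), and then again over the result.
    hat :: s -> s -> s -> PTS s -> PTS s
    hat s k sTriangleK p = p
      { sorts = sorts p ++ [sTriangleK]
      , rules = rules p ++ [(s, k, sTriangleK), (s, sTriangleK, sTriangleK)]
      }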
And my result, again, is that if I add all these extra rules I'm normalizing if and only if I
was originally normalizing. So these are two methods to combine pure type systems
while preserving normalization. Okay, and the proof: I'm not going into it but it involves
identifying unique sorts associated with each redex. The sort is going to classify the
redex, and we need to sort of consider each redex individually. We can erase all the
redexes that come from one PTS and just examine the term cleaned of all these
redexes. And on the other side we can just examine each individual redex from the
other PTS. And then, we need a commutation result which says, well some reductions
can be done in one PTS and they won't interfere with any of the redexes done in the
other PTS.
So this is a very combinatorial proof, and it uses ideas from Bernardy and Lasson who
have this similar goal of trying to enrich PTS's so that they can express more things.
Okay, wow. Yeah, I'm pretty early. I guess nobody is going to complain about that. So
pure type systems can be used to ask and to answer questions about quantification. The
main question we asked was what kinds of quantifications can we add to a logic without
risking inconsistency, without making it more powerful in the sense that more things are
provable, just more powerful in the sense that more things are expressible.
My second observation which is kind of I feel the real message I want to get across is
that it's interesting to study normalization preserving extensions. Often we take a PTS
and we try to prove it's normalizing sort of independently of everybody else. We sort of
put it in a black hole and we study just that PTS. But what I'm saying is that we should
study combinations and interactions between PTS's. And this is an interesting field of
study because it allows us to build richer type systems. We were just talking about F star
yesterday. And F star can be seen as a very rich type system which is made of these
different components that we put together in a larger system which is very expressive.
Okay, so I've shown that certain rules can be added safely. I'd like to be able to say,
okay, using these proof techniques, these are all the possible rules which can be added.
I think there are more rules that are safe, that can be proven safe but I haven't
characterized exactly which rules can be added to a PTS. And the question here is can
we take systems that are sort of built like this, that sort of have this term component and
this logical component, and can we simplify, for example, consistency proofs using this
approach. Often you want to show normalization of the propositional side but you have
these terms that come in and so you're scared they interact in some manner and you
have to show that they don't. I'm wondering if this result is general enough to be able to
say, okay once and for all this is what you need to show to be sure that there's no
interaction between the logic and the programming side of your type system.
Okay, so of course an obvious extension is what happens when we add inductive types?
What happens when we add existential types? What happens when we make our type
system more complicated? At what point do we lose these nice results?
And finally this is more speculative: it would be nice if we had a proof that was less
combinatorial. Often there are proofs of conservativity, there's a combinatorial version
which is very hands-on and then there's a semantic version which is a little bit more
abstract but often much more succinct and more powerful. So I'd like to be able to
understand a model theoretic view of these conservativity results.
That's all I had to say about this, so thank you very much. These are references if you
want to see them, and I'd be happy to answer any questions.
[Applause]
>>: Have you taken your theorems through some proof system?
>> Cody Roux: No. There are parts of the theorem which are reasonably
straightforward, which are just rewrite theory, but there is a part that is quite technical. In
particular this disjoint sum separation is quite technical. So it would be a worthwhile effort
to prove this theorem formally. I'm afraid I don't have much experience with proving
formal statements about programming languages in Coq. But, yeah, it would be a
worthwhile effort. And it's not sort of unfeasible because there has been a lot of work on
pure type systems already.
>>: [Inaudible] but there are other systems [inaudible].
>> Cody Roux: Yeah, there are other systems but pure type systems have been
studied in Coq, I'm sure of that. I don't know if they've been studied in other things.
Probably Agda, I don't know. [Inaudible] maybe not because they're not as interested in
dependent types but maybe there are things in [inaudible].
>>: Quick question: so have you seen some people who are working on type systems
that -- there's this system, for example, called Trellys, which chooses to include star
colon star, letting parts of the system be inconsistent but tries to isolate within that system a core
that remains consistent. Have you thought about how you might try to do that in your
setting? Like have some part of the system that is [inaudible] crazy, but you have some
theorem about some subset of your type system?
>> Cody Roux: Yeah. So Trellys is interesting because it was part of the motivation for
this result, because Trellys has some non-termination, for example. But that's the
programming part. And then, you want this logical part that's terminating and you want to
say, oh well there's this separation. I mean, you want to have this system and you only
have Q that's normalizing. Okay? But we can show that if Q is normalizing then in this
system, if you have a term that comes from Q, it has a normal form. And so that's one
of the motivations I had. You want to show this in Trellys and it's hard. But then there's
this additional thing where they have this inconsistent type system but they say, "Okay, if
after the fact you show that you haven't actually used this rule then you're okay." And I
haven't really thought about that but in general it's just for convenience in some sense. I
think if they could add universes in a simple extensible way they would've added
universes.
And they just say, "Okay, we put type, type and it's convenient. But someday we'll add
universes and we won't need type, type anymore."
>>: I'm not sure I think of it only as a convenience. I mean it's also an expressiveness
thing. You get to write non-terminating programs because you may want to.
>> Cody Roux: Yeah, but adding type, type...
>>: [Inaudible] same properties about those things in a consistent way.
>> Cody Roux: I agree. And you want the non-termination to be in your term language
and not in your proof language. And, yes, this approach sort of has the ambition to be
able to analyze these types of systems. But I think type, type is sort of beside the point.
When you add non-termination you want to add a fixed point operator. You don't want to
add type, type, because it's an open question whether you can actually write a fixed-point
combinator in type, type. You can have non-terminating terms but you don't have a
fixed-point combinator. So type, type is kind of an accident. It's just a convenience. It
dispenses you from having to think about universes. It makes everything simple. It just
makes things inconsistent, which is kind of an inconvenience. But, yeah, you want
non-termination. You want fixed points. And here you definitely want termination because it's
a logic.
And this work is definitely aimed at saying things when P is non-terminating and when Q
is terminating. I haven't talked about this because it's a little bit less pretty to express.
But you can have a theorem to that effect.
>>: You do have such a theorem?
>> Cody Roux: What?
>>: You do have such a theorem like the case where P is...
>> Cody Roux: Yeah. But you can't just say: for all P, Q is normalizing. You'd have to
say: for all terms of a sort which comes from Q, that term has a normal form. So it's
a little bit less elegant.
>>: Okay.
>> Cody Roux: Well, thank you.
[Applause]