>> Leonardo de Moura: It's a great pleasure to introduce Cody Roux. He is a postdoc at CMU. He works in type theory and theorem proving. Today he is going to talk about the structural properties of pure type systems. >> Cody Roux: Okay. Well, I want to thank Microsoft and everybody for having me here. It's really a pleasure to be here. I'm from the Department of Philosophy, so I'm going to talk about philosophy. Hopefully there will be some intersection with computer science and people will find some relevance to what's interesting to them. The title is a little bit mysterious so I'm going to explain it of course, and really what I want to do is ask a philosophical question about the notion of abstraction. It's actually a series of questions, and those questions, I'll show, can be answered in the framework of pure type systems. And I'm going to explain what these questions are, what pure type systems are and how I can use pure type systems to answer them. And to answer these questions I need to examine something that I call structural properties of pure type systems, which explains the title of my talk to a certain extent, but I haven't said what structural properties are. And then, basically at the end I'm going to give the main results and try and justify the fact that they do answer these philosophical questions about the notion of abstraction. Okay, so in my original talk I had mathematicians and programmers because I was talking to mathematicians. But programmers and mathematicians: basically the most crucial part of their work is recognizing patterns and abstracting over them. Okay, this is really what we do the most: try to understand these patterns and say, "Okay, this is an important thing and I need to separate it and make it modular." This is a trivial example but if you have 1 plus X plus 1 plus Y plus 1 you can say, okay, well here's a pattern; 1 appears plenty of times. And I can sort of express this pattern by introducing a lambda here and saying every occurrence of one can be replaced by this abstract variable Z. And then, I can apply this to 1 and it has the same meaning as above. And you can imagine if 1 is somehow a very complex computation then you can say, oh, well by abstracting here I've saved time because I only need to do this computation once. The abstraction allows me to sort of reunite all of these individual computations into one computation. But abstraction is also useful for mental processes. And the way we do abstraction is in general this kind of scientific method where we have these concrete observations that 2 plus 4 happens to be equal to 4 plus 2. And from a number of concrete instances, we create abstract instances. And this is a very important step: to go from concrete observations to abstract observations. Once we've done this step basically we have this second step where we create a universal observation that says that this concrete observation has been turned into an abstract statement which is universally true. And then, mathematics tries to prove things universally but based on these concrete observations. Okay, so I want to understand what allows us to do this. What is the process involved in having concrete instances, turning them into abstract instances and then allowing yourself to form a universal statement about this? What is actually happening when I do this? In particular when are we allowed to make an abstraction? And, when are we allowed to make a universal quantification? 
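[Going back to the 1 plus X plus 1 plus Y plus 1 example, here is a minimal Haskell sketch of that abstraction step; the function names are purely illustrative and not from the talk.]

    -- The expression with the repeated subterm written out three times.
    before :: Int -> Int -> Int
    before x y = 1 + x + 1 + y + 1

    -- The same expression after abstracting the repeated subterm: a lambda
    -- binds z once, and the abstraction is applied to 1.  If 1 were instead
    -- an expensive computation, it would now (conceptually) be done once
    -- rather than three times.
    after :: Int -> Int -> Int
    after x y = (\z -> z + x + z + y + z) 1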
An interesting question I haven't really explained sufficiently yet is what do we get as a result? What is the result of a universal quantification? What kind of category is in it? I'm sorry, what kind of category is it in, what is the nature of the universal quantification? Is the proposition 2 plus 4 equals 4 plus 2 of the same nature as for all X,Y: X plus Y equals Y plus X? I mean an obvious answer is yes; they're both propositions. But there are refinements to this answer where we could say, oh, well really observing that 2 plus 4 equals 4 plus 2 is not the same as having this universal statement. Okay, so there is something called the Curry-Howard correspondence. If you've heard of it, that's great; and if you haven't, that's not a big deal. What the Curry-Howard correspondence expresses is that there is some kind of relationship between universal quantification and function spaces, okay, typically computable function spaces. This is a really nice observation because it says basically building functions by lambda abstraction is really the same thing as performing the introduction for universal quantification. So we have these logical operations that correspond to programming operations. Our question about universal quantification and what was the nature of the universal quantification can be rephrased using this correspondence as a question about function spaces. What does it mean to build a function space and what is the resulting nature of that space? Okay, now to ask my second question about abstraction and quantification I just need a little background. The simply-typed lambda calculus is a very basic programming language. In fact it's so basic that it's sort of a theoretical basis of programming languages; it's very simple. So you have base types -- I'm going to use the pointer. So you have these base types that express sort of atomic kinds of types you have in your language. And then, you have functions on these types. And in particular you have higher-order functions. Here I have a function that takes as input this F, which is itself a function. Okay? So the simply-typed lambda calculus basically has just these types and higher-order functions and that's all that the simply-typed lambda calculus is made out of. And something that's kind of surprising if you don't know it is that in the simply-typed lambda calculus every program is terminating. So you can run every program and you'll get a result. And this is sort of an important observation. I'm sorry. So it's an important observation because it gives you a large number of consequences. In particular it says something like, for example, you can decide equality between two terms of the lambda calculus. We'll see other consequences of termination. Okay, so we have the simply-typed lambda calculus and it's very nice. And it has these nice properties, but it's not a real programming language for many reasons. One of these reasons is that there is no polymorphism. This is one of the basic requirements that you want to have today; you don't want to write the same function twice. And in particular this function lambda X, X that just returns its argument, you should be able to apply it to the number three to get three. And you should also be able to apply it to the Boolean true and get true. You don't want to write two functions: one which takes integers, for example, and one that takes Booleans with exactly the same code. So you want to add this feature to the language. 
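[In Haskell-like notation, the duplication problem and its polymorphic fix look roughly like this; a hedged sketch, with names chosen only for illustration.]

    -- Without polymorphism you would have to duplicate the identity
    -- function at every type you need it at:
    idInt :: Int -> Int
    idInt x = x

    idBool :: Bool -> Bool
    idBool x = x

    -- With polymorphism one definition covers both uses:
    idPoly :: a -> a
    idPoly x = x

    -- idPoly 3 == 3, and idPoly True == True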
You have the simply-typed lambda calculus, how do you add this feature of polymorphism? And what's interesting is that there are two possible answers to this question. There are two ways to add polymorphism, and they are fundamentally different. They both start the same way: you add variables at the type level. So you have these simple types and now you have type variables, types that can be instantiated with other types. And then you add quantification where you say, "Well we have a function and it's of type for all X, X arrow X. Okay? For any potential type it has the type X arrow X." Okay, now the difference between these two approaches to polymorphism is what does the "for all" here quantify over? What are the possible instances of X? What am I allowed to replace X by? The first answer says I'm only allowed to replace X with simple types so no quantifiers in the things I replace X with, and the second answer is any type including types that have quantifications themselves. These two choices lead to dramatically different programming languages. In the first case, we have polymorphism. We can do this polymorphic identity but it's conservative in the sense that it doesn't add fundamentally new functions, functions that really do new kinds of computations. In the second case you have this incredibly powerful system called System F which is a lot harder to analyze but that has many, many new functions. And in particular, termination for System F still holds but it's a lot harder to prove. In the first case we've added polymorphism but it's safe in the sense that there are no new functions. In the second case it's somewhat unsafe because you get all these crazy new functions. Okay? And this difference comes completely from what instantiations you allow for the universal quantifications. This is a very important point on what it means to have polymorphism. More generally you want to know what kinds of quantification will lead to conservative extensions, and a conservative extension in my sense is something where you have more expressive types but you don't have new programs. You can't write programs that behave in a fundamentally different way. And this is often desirable because you want types to be more expressive but you don't want to be able to write programs that do silly things or wrong things or, you know, don't terminate for example. Okay, I'm going to talk briefly about dependent types because it's a notion that underlies everything that follows. A dependent type is a type which contains term level information. And I mean I hope this example is going to be sufficient to justify sort of the idea behind dependent types. Here I have a list with three elements. I would like to have a type that can express the fact that the list has three elements, and to express that fact it needs to contain a term. So Vec 3 contains the term 3, which expresses that the list 1, 2, 3 has 3 elements. Okay? And we can quantify over these type level expressions as well. In fact you'd want a function reverse to work for any length vector so for all N it takes a vector with N elements and it returns a vector with the same number of elements. Okay? So you're going to be able to quantify over that. And same question: you want to add these quantifications and you want to know whether adding these quantifications is safe, in the sense that you have no new functions, or whether you've created something fundamentally more powerful in the sense that you have different functions. 
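[Haskell does not have full dependent types, but the flavor of the Vec example can be approximated with GADTs and promoted naturals. This is only a rough illustration of the idea, not the dependently typed system discussed in the talk.]

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    -- Natural numbers at the type level (promoted by DataKinds).
    data Nat = Z | S Nat

    -- A vector whose type records its length, so Vec ('S ('S ('S 'Z))) Int
    -- is the type of integer vectors with exactly three elements.
    data Vec (n :: Nat) a where
      Nil  :: Vec 'Z a
      Cons :: a -> Vec n a -> Vec ('S n) a

    -- Operations quantify over the length: for all n, take a vector with
    -- n elements and return a vector with the same number of elements.
    vmap :: (a -> b) -> Vec n a -> Vec n b
    vmap _ Nil         = Nil
    vmap f (Cons x xs) = Cons (f x) (vmap f xs)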
And to answer this question in a formal manner, I'm going to present an existing framework that's called pure type systems that really allows us to express in a very fine-grained way what the dependencies are, what it means when you quantify over a variable in a type. Are there any questions at this point? Okay, so a pure type system is a framework of many different type systems, and it's a generic framework that allows us to express typed programming languages. But using the Curry-Howard isomorphism, you can also say it's a framework for expressing logics. Okay? So logics and programming languages are sort of two different sides of the same coin. And one characteristic of pure type systems, what makes it a nice sort of playground for understanding type systems is that it only allows universal quantification. It doesn't have existential quantification. It doesn't have conjunction. It doesn't have data types. It just has universal quantification and universal quantification is a logical notion. The corresponding programming language notion is that of dependent function space. Okay, so a few facts about pure type systems that sort of additionally justify the fact that we're interested in them: one, they're very expressive. You can actually find a pure type system that is so expressive that it allows you to express all of set theory, so basically all of mathematics can be expressed using a particular pure type system. Some pure type systems are just incredibly powerful specification languages. They're relatively well studied. They were invented in the 80's. And here I cite Barendregt but he wasn't the only inventor. But, this book is basically the Bible where he explains all the basic properties of pure type systems. And they've been studied relatively extensively since the 80's so they're a well-understood framework. And they're quite flexible; you can use them as sort of a core -- I'm sorry. You can use them as the core of a functional language. You can say, okay, I take a pure type system and I'm going to use it to base a real world language like Haskell or a specification language like Agda or Coq. To do that I'm going to need to add some features but I'm going to start by using a pure type system and then I'll add features to this system. So studying pure type systems gives us a starting point for studying these more complex programming languages like Haskell and Coq. However, the theory of pure type systems is something that can be really quite complex. In particular they were invented in the 80's but there were several open questions, which is something that's quite unusual in type theory. I mean, open questions in type theory is kind of a surprising fact. There are many open questions in complexity theory but in type systems there aren't that many famous open questions. And I name a couple here but it's not very important what they are; there are these two open questions concerning pure type systems. Okay, so our questions about adding polymorphism and understanding quantifications: can they be answered using pure type systems? This is a rhetorical question because I'm arguing that we can. But to give my argument I need to explain what a pure type system is and a pure type system is completely described by these three things. The first one is a set of sorts. Okay? Just any arbitrary set that we call the set of sorts, S. The second one is a relation between elements of S that we call axioms. And the third one is a ternary relation on elements of S that I call rules. 
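[To make these three pieces of data concrete, here is a hedged Haskell sketch of a PTS specification; the type and field names are mine, purely for illustration.]

    -- A pure type system is completely described by its sorts, its axioms
    -- (pairs s1 : s2) and its rules (triples governing Pi-formation).
    data Spec s = Spec
      { specSorts  :: [s]          -- the set S of sorts
      , specAxioms :: [(s, s)]     -- A: (s1, s2) means s1 is of type s2
      , specRules  :: [(s, s, s)]  -- R: (s1, s2, s3) allows forming
      }                            --    Pi x:A. B : s3 when A : s1, B : s2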
And that's it; that's all you need to describe a pure type system. Now of course I haven't talked about type systems yet, but once you have this data you have everything you need to understand how the type system is described. I'm going to explain informally what these pieces of data mean. The elements of S represent a category of objects, a type of objects. Type is such a loaded word that I'm kind of reluctant to use it. But every element of S represents informally objects that are alike in some manner. This is kind of vague so I'm going to give examples. For example, star -- this star here -- as an element of S is often the symbol used to represent the category of propositions. So every proposition is in star, okay, is in the category star. And box is often used to represent the category of all types or of all sets. Okay? And iota is traditionally used to represent the category of natural numbers. So iota represents all the natural numbers. So every element of S represents, like this, a category of objects. And then, if S1 and S2 are in A which means that there is an axiom S1, S2, this informally means that S1 is a member of the category S2. Remember that each element of S represents a collection, a category of like objects. Well, sometimes other sorts can be members of that category. Okay? So if S1, S2 is in A then S1 is a member of the category S2. The third one is a little bit more complex but it specifies in which way we can quantify over parameterized elements of a category. So if I have an element of S2 that depends on a parameter of S1, I can universally quantify over that parameter and it gives me a result in S3. Okay? I'm going to give examples of this later but basically this says that if A is an element of the sort S1 and for each element X of A, B of X is an element of the sort S2, then you can quantify over all these X's and you'll end up in S3. Okay? So we write pi instead of for all, but that's just notation. >>: In your last slide, [inaudible] with S3, the last colon: where does it bind? It says the entire for all expression... >> Cody Roux: Yes. >>: ...is a member of...? >> Cody Roux: Yes. I've had this question before. I should reduce the space. >>: [Inaudible] >> Cody Roux: Okay. This is one object which is the universally quantified statement basically where for every X in A, B of X holds, or we have an element of B of X. And all this is in the category S3, the sort S3. Yeah, sorry about that. My kerning is poor. Okay, so I write pi instead of for all. That's just a tradition. Okay, so given a PTS P I'm going to introduce formally what the type system associated to P is. I said P was sorts, axioms and rules, and now I'm going to say what the type system associated to those sorts, axioms and rules is. And that's really what we're going to study. We don't care about the set, the axioms and the rules; we care about this type system that's determined by those things. So the first one just says that if S1, S2 is an axiom then we can derive the judgment S1 is of type S2. This is unsurprising. That's exactly what I said S1, S2 in A meant. And the second one is this more complicated statement about rules which says that if A is of type S1 and if B is of type S2 under this assumption that X is of type A then pi X A, B is a type which is in the sort S3, if these three sorts are in our rules. Okay? So these two rules sort of tell us how to build these basic types which are the pi's and the sorts. 
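[Written out in standard inference-rule notation, the two formation rules just described are roughly the following; this is the usual textbook presentation, inserted here only as a reading aid.]

    \frac{(s_1, s_2) \in \mathcal{A}}{\vdash s_1 : s_2}
    \qquad
    \frac{\Gamma \vdash A : s_1 \quad \Gamma, x{:}A \vdash B : s_2 \quad (s_1, s_2, s_3) \in \mathcal{R}}
         {\Gamma \vdash \Pi x{:}A.\, B : s_3}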
And once we have that we can build terms, and there are only three ways to build a term, which is either it's a variable in the context with this first rule Var, or it's an abstraction that builds a pi, that builds an element of the pi type. Or it's an application where I have an element of the pi type, I have a term that's of the domain and I apply the function to the term. So this is a way of constructing pi's and of destructing them. And you can see that this looks like the universal quantification rule. If I have for all X in A, B; well then, in particular B of U holds, okay, if U is of type A. This is really like universal quantification. And similarly, if generically for an arbitrary X in A, B holds, then for all X in A, B holds. But it's also like a function type: if given an X in A I can build an element of B then I can build a function that takes an element in A and returns an element in B. And this is the same. If you ignore this, if you have a function from A to B and an element of A, you can just apply it and get an element of B. So that's all there is to it. This is completely what pure type systems are, except for this conversion rule which says that if two types are equal in some sense then any element of the first type is also an element of the second type. And equal in some sense means equal with respect to this computational rule which is just the ordinary beta reduction rule, where a lambda applied to an argument is just equal to the body of the lambda where you replace the variable by the argument. We have four things really. >>: Can I ask a question? >> Cody Roux: Yes. >>: How come you have a kind of a typed beta reduction, why do you require A prime to be sorted as opposed to getting it as a property of beta reduction, that it preserves sorts? >> Cody Roux: Beta reduction preserves sorts but beta expansion doesn't. So this is kind of a sanity check that says I haven't expanded things and introduced some well typed creature that would then disappear when I beta reduced. There are some versions of pure type systems that don't have this requirement, but in general this is mostly a sanity check. We say... >>: Even beta expansion can be designed to be well typed, right? >> Cody Roux: You could limit that expansion to only well typed beta expansions, but that significantly complicates the theory of pure type systems. [Laughing] I mean one of the open conjectures about pure type systems is that this untyped conversion here and the typed conversion lead to the same underlying systems. It's completely non-obvious. And it's quite surprising because it seems obvious. It seems obviously true that if you limit these conversions to well typed conversions, no harm should come. But it's actually very hard to prove. So I'd say don't concentrate too much on this. I've added it for convenience but there are technical reasons for why it's there. Does that answer your question? We can come back to it. And this is basically all there is. The only rules I've omitted are structural rules that I call the boring rules. But this is all there is to it: just sorts and pi types, abstraction, application, variables of course and then this conversion rule. And that's all there is to it. That's a very simple type system but it allows us to model this vast array of different programming languages, and the least it can do is model the simply typed lambda calculus. So one way to model the simply typed lambda calculus using this framework is to introduce these two sorts, iota and star. 
This iota is going to represent a base type, say, of natural numbers. And star is going to represent the sort of all types. And then, the type of natural numbers of course is a type so it's in the sort of all types so we add this axiom. And now this rule says that if we -- I'm sorry -- if for each element of a type we have an element of a type then we can perform the abstraction and it gets a new type. I'm sorry. This was poorly explained. This rule says that if given an element of a type we can form a type, then I can build the abstraction that takes an element of this type and returns an element of that type. And this still lives in star. Okay? This is basically the rule that allows us to form arrow types like this. Okay? So I can form this identity function on iota by abstracting over X and returning X. This is of type iota arrow iota, where iota arrow iota is just pi X of type iota, iota. And the fact that we can form this pi type uses this rule. So we have the simply typed lambda calculus with this rule because we can form these pi types using this rule, and these pi types correspond to just function types. All right? Now I'm not going to go into too much detail but I do want to stress the fact that using this framework of pure type systems you can build these very, very rich systems. So I just showed that we can build the simply typed lambda calculus. You can also build the simply typed lambda calculus where instead of having a single base type you can declare these abstract base types in the context. And this is just presented in a slightly different way, where the type of all types itself has a type. And that's the only difference. And in this second version basically you can declare abstract types that you call A, B, C and all these are type variables but you can't quantify over them. The next system, we often call it star colon star or type colon type. And it's the same as the simply typed lambda calculus except star is of type star. The type of all types is a type. And it seems relatively innocent; I mean it's the same as the previous one but it just has this different axiom. System F which allows general type quantification, it looks like the simply typed lambda calculus again except it has this rule which allows us to quantify over general types. So we can form this polymorphic identity type where we say for every type X, X arrow X is a type. And we use this rule to form that quantification. And then, the calculus of constructions additionally adds these two forms of quantification. And these correspond intuitively to type constructors like list, which is a constructor which takes a type and returns a type, and dependent types, which are types that can depend on values. The value is here, and there's the type. And you say, well, you're allowed to have types that depend on values using this rule. Okay? So all these different rules allow different kinds of function spaces to be built. This is the ordinary function space of functional programming languages. This is polymorphism. This is type constructors and this is dependent types. So we have this nice little picture where every rule corresponds to a kind of type construction we want to allow. And then, this U minus is mostly important for historical reasons. I just wanted to show it because it's very similar to System F but it also allows polymorphism at the kind level. It is kind-polymorphic. And then, CC omega is basically the core of Coq. 
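[Using the Spec sketch from a moment ago, the systems just listed can be written down as data. The encodings below are only my reading of the slides, with sort names following the talk.]

    data Sort = Iota | Star | Box deriving (Eq, Show)

    -- Simply typed lambda calculus: one base type iota, one rule for arrow types.
    stlc :: Spec Sort
    stlc = Spec [Iota, Star] [(Iota, Star)] [(Star, Star, Star)]

    -- System F: the (Box, Star, Star) rule allows quantification over all types.
    systemF :: Spec Sort
    systemF = Spec [Star, Box] [(Star, Box)]
                   [(Star, Star, Star), (Box, Star, Star)]

    -- Calculus of constructions: additionally type constructors and dependent types.
    cc :: Spec Sort
    cc = Spec [Star, Box] [(Star, Box)]
              [(Star, Star, Star), (Box, Star, Star), (Box, Box, Box), (Star, Box, Box)]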
And CC omega is a little bit more complicated because it has this infinite set of sorts. It has star and box i for any natural number i. And this has a bunch of rules where you say star is of type box i for any i and box i is of type box j, for any i and j such that i is smaller than j. Okay? And it has these complicated rules which are a generalization of the rules for the calculus of constructions. It has this infinite hierarchy of sorts, and this forms the basis of the calculus of constructions. This mainly serves the purpose to show that we can express all these really powerful or really interesting type systems just by a very simple set of sorts, rules and axioms. Now normalization in these systems is a very... >>: [Inaudible] If you had box sub zero is that the same as star? >> Cody Roux: I'm sorry. What line? >>: The last line. What if -- So you have a box sub i. >> Cody Roux: Right here? >>: Yeah, point to it. It's on the left, for example. Or any one of those boxes, right? >> Cody Roux: Yes. >>: My question is, is box zero the same thing as star? Or are they different? >> Cody Roux: No. Box zero is just right above star. Yeah, it's star then box zero then box 1 and box 2. That's sort of the mental picture you need. >>: I think you can explain the box i star, star. This rule shows that the star has a special status, right? >> Cody Roux: Yes. Yes. Yeah, okay, yes. If your point was, is this star redundant now that we have all these infinite boxes, the answer is actually no because of this rule. And thank you, Leo, for pointing that out. >>: I guess also if you had box of zero equal to star would that mean that -- Oh, it's strictly less than. Okay. I was wondering if that led to inconsistency with star colon star. >> Cody Roux: Well, I haven't said that star colon star is inconsistent yet. But, yeah, it would. You'd basically contain this system. But, yes, star has a special status. You have box i star star, whereas here K has to be bigger than the maximum. Okay? So if star was box minus one, say, this rule would be violated. Okay, what does it mean for a pure type system to be normalizing? It just says that if you're well typed using this type system corresponding to the pure type system then you have a beta-normal form. That's what it means to be normalizing. And normalization is quite nice because first of all it ensures decidability of type checking, which is the least you could ask of a type system: that you can decide whether a term has a type or not. And normalization gives you that guarantee. It allows you to compare terms which is necessary to be able to perform type checking. And the second thing it implies is if you view the system as a logic, if you view this pi quantification as a universal for all and you think about these types as propositions then normalization implies consistency of this logic. It implies that obviously false types are uninhabited or that not all types are inhabited which is in general what consistency of a logic means. Not all propositions are provable. Normalization gives you that property because you can look at normal forms and you can prove that some types cannot possibly have an inhabitant in normal form. So normalization is just a really important property that you'd like to be able to guarantee for these type systems. The thing is it's really hard to predict. I took all the type systems we had previously with these axioms and rules. And the simply typed lambda calculus is normalizing of course in the two versions. 
But then you have this sort of similar system where this is the only difference and all of a sudden it's not normalizing any more. Now in retrospect you could say, well, it's obvious because the type of all types is of type itself and so there is this kind of circularity. But I can assure you that it's not obvious to find a counterexample that actually is non-normalizing. So this came as kind of a surprise. System F, which doesn't seem much simpler than star colon star, is normalizing, and so is the calculus of constructions, which looks really complicated. This U minus is just a little bit more complicated than System F. You only allow this extra polymorphism over kinds. This is not normalizing. Okay? This also kind of came as a surprise. And this crazy system that has this infinite tower of sorts, it is normalizing. So it seems like there's this sort of random jump where you have normalizing sometimes and not normalizing sometimes, and it's very difficult to predict which one it's going to be. One thing that was unclear last time I gave this talk: it's unknown whether this problem is decidable or not. Nobody knows if there's an algorithm where you put in a type system and it outputs yes if it's normalizing and no if it's not. This is just a very hard problem. So how do we attack this problem? The mathematician in me says, well when you have a hard question, you have to decide to not answer it. Okay? You say, okay, this question is too hard. I'm going to ask a different question. And hopefully this different question is going to shed some light on the original hard question. So the different question I ask is given normalizing PTS's what are the operations that preserve normalization? What can I do to a PTS, how can I modify a PTS in such a way that the resulting PTS I have is still normalizing if the original ones are? And this is what I call the study of the structural theory of pure type systems. I want to examine the set of all pure type systems and understand the structure of this large set. I want to understand the interaction of how you construct new pure type systems in ways that preserve normalization. Here's a PTS that I call MinLog because it's minimal logic, minimal implicative logic. So it's a very, very simple logic where I just have a sort of propositions and then I have this sort of worlds which is just a sort containing the sort of all propositions. And I have this rule that says you can build new propositions as implications. This rule allows me to build implications. Now you may notice that this is just exactly the simply typed lambda calculus from earlier, but it's seen as a logic. So I renamed the sorts so that it was more apparent that this is a logic, okay, which is just minimal implicative logic. So it's a very simple logic and in particular since it normalizes it's a consistent logic. You know that this logic is consistent but it can only express implications, which is not that fascinating. I mean first-year logic students can easily understand this. We want to examine terms now, so we build this new PTS which contains this sort which is the sort of all sets. I'm going to look at sets of terms and I need a sort to classify these sets of terms. Now I have this other sort that's going to allow me to build term constructors, functions that from terms build other terms. Okay? So set is of type Univ which allows me to declare set variables, say, suppose we have a set A. And these two rules which seem more complicated than the previous ones are actually simpler. 
I say that if I have a function from sets to sets, that lives in the new sort Fun, okay -- So once I have a function, the function space is not a set anymore. So I can build the functions but I can't keep iterating this which means I only have sort of first-order terms. I can build functions -- Okay, this allows me to build functions with several arguments. But these two rules only allow me to build first-order functions, functions that take a number of arguments in a set and that return an element of another set. So this is a really nice PTS in that it only allows me to build first-order terms. Okay? I have a PTS with simple propositions. I have a term language with first-order terms. Now what do I want to do? I want to build a new PTS that allows quantifying over these terms using the simple logic MinLog. And to do that I build a new PTS which just takes the sorts of MinLog and Term, puts them together and adds a new sort, Multiverse which is unrelated to all the other sorts. And I keep the same axioms and rules so I still have my terms on one side and my propositions on the other side, but now I'm allowed to make propositions depend on sets. Propositions that depend on sets are still propositions. And this allows me to form universal statements about terms. Okay? And these two rules -- Wait. What is worlds already? Ah, yes. These two rules allow me to quantify over all propositions. I can quantify over all propositions using this rule, so I can say for all P, P implies P for example. But the resulting proposition is of a new nature. It's this conservative extension I talked about earlier. I want to be able to quantify over all propositions but then I want this to be a higher level proposition, a new proposition. >>: [Inaudible] worlds from the [inaudible]? >> Cody Roux: Yes. It's the type of prop. >>: Okay, I see. >> Cody Roux: So if I quantify over all propositions then the sort of that type is worlds. Okay? So now these two rules allow me to quantify over all propositions. And this rule allows me to quantify over terms to get a proposition that talks about all terms in a set. >>: And the last one is just giving you like a [inaudible]? >> Cody Roux: Yes. This allows me to quantify once over all propositions and this allows me to quantify again over all propositions. So usually these two rules come together. >>: And sorry one more question. Why did you -- I mean if you didn't know anything about what these sorts actually meant semantically, you could try other combinations, like prop, set, prop, and that may not make much sense but could just be syntactically applied. >> Cody Roux: Yes. >>: Did you design these particular constructions because you were aiming for first-order logic? >> Cody Roux: Yes. >>: Okay. >> Cody Roux: Yes. But I'm going to show in the next slide that there's a theorem that allows me to say things about these special rules. So it's a theorem that takes these rules and says, "Oh, well they satisfy a certain criterion." And the criterion they satisfy in particular implies that my new PTS is normalizing if and only if the old PTS's were normalizing. Okay? So these rules have a shape that is simple in some sense. It doesn't add anything to the pure type system. >>: So the first rule I see is having some interaction between the two PTS's, right? >> Cody Roux: Yes. >>: The next two rules are all compositions of sorts from just the first PTS alone. >> Cody Roux: Yes, that's true. 
But in some sense these rules allow more expressivity about propositions, but they don't allow us to prove more propositions. It's a conservative extension. So all the propositions in the original logic that were unprovable are still unprovable. I can't prove any new propositions. >>: So could I have started with the simply typed lambda calculus and forgotten about the terms and just taken your second and third rule? >> Cody Roux: Yes. That would've given me polymorphism. >>: Okay. That would still have been -- You're going to tell us about a theorem that would ensure that that system is still [inaudible]? >> Cody Roux: Yes. So this answers the question I asked at the beginning of this slide which is how do we add polymorphism in a safe manner? And this is the answer. I mean, these two rules add polymorphism in a safe manner. And since they satisfy one of my theorems, you're guaranteed just syntactically that you can't break normalization. Yeah, so in more detail I'm explaining that this rule -- set, set, prop -- allows building propositions that depend on terms. So if I have a set variable A and I have a proposition that depends on A, I can form this because of my set, set, prop rule. I'm sorry, no. Yes, I can form this because of my set, set, prop rule and I can form this universally quantified statement which says that for every X in A, P of X implies [inaudible]. And I can even prove it but that's not the important part. The important part is being able to express statements that depend on terms. All right? So my set, set, prop allows for these kinds of propositions. And the worlds, prop, multiverse allows us to additionally quantify over P, for example. So I can say for any P, which is a predicate on the set A, and for any X in A, P of X implies P of X. And this is of type multiverse, okay; this is not a proposition. It's something that lives in a higher world. >>: Did you have set, set, prop? You had set, set, fun and then set, prop. >> Cody Roux: Oh, yes. That's a typo. I'm sorry. >>: Okay. So set, prop, prop. >> Cody Roux: Yes. >>: Okay. >> Cody Roux: Darn. I missed that one the first time around. Yeah, yeah. Set, prop, prop. So you see this is a set and this is a proposition, and the result we get is a proposition. Okay, so the theorem is if MinLog and Term are normalizing then this new FOL PTS I've just constructed is normalizing. Okay? And in fact we can additionally show that FOL is a conservative extension which means that there are no propositions which were unprovable in MinLog and that are now provable in FOL. Okay? We can express new things and prove them but certainly the old things haven't changed. There are no new provable propositions from MinLog. All right? And in general -- So I've basically said what I wanted to say, but I am going to show how we express formally this theorem. And the formal expression of the theorem basically says I can take two PTS's, P and Q, and I can take their disjoint union, okay, where I just take the sorts and I suppose that they have zero intersection. I just put them together and I get a new PTS. And the first theorem is if you are typable in this new PTS then you're typable in one of the old PTS's. But this theorem is a little bit subtle to express because this context could be mixed. It could be a mixed context that takes some types from P and some types from Q. You have to sort of filter out the unwanted contexts. So this proposition is actually nontrivial. 
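[Pulling the construction together as data, and assuming the Spec type sketched earlier, MinLog, Term and the combined FOL system might be written roughly as follows. The Term rules and the exact extra rules below are only my reading of the spoken description and slides, so treat this as a hedged sketch rather than the talk's definition.]

    data FSort = Prop | Worlds | Set | Univ | Fun | Multiverse
      deriving (Eq, Show)

    -- Minimal implicative logic: propositions, and implications between them.
    minLog :: Spec FSort
    minLog = Spec [Prop, Worlds] [(Prop, Worlds)] [(Prop, Prop, Prop)]

    -- A first-order term language: sets of terms, and first-order functions
    -- between them (the function space lives in Fun, not Set, so it cannot
    -- be iterated into higher-order functions).
    term :: Spec FSort
    term = Spec [Set, Univ, Fun] [(Set, Univ)]
                [(Set, Set, Fun), (Set, Fun, Fun)]

    -- The combined system: the disjoint union of the two, one new sort
    -- Multiverse, and the extra quantification rules discussed above.
    fol :: Spec FSort
    fol = Spec
      (specSorts minLog ++ specSorts term ++ [Multiverse])
      (specAxioms minLog ++ specAxioms term)
      (  specRules minLog ++ specRules term
      ++ [ (Set, Prop, Prop)                -- propositions may depend on terms
         , (Worlds, Prop, Multiverse)       -- quantify once over all propositions
         , (Worlds, Multiverse, Multiverse) -- and quantify over them again
         ] )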
And this theorem that says that in the disjoint sum -- if you're typable in the disjoint sum then you're typable in one of the two -- it implies normalization of the disjoint sum. Okay? So forming disjoint sums is safe. But I've done two things: I've formed the disjoint sum and then I've added these extra rules. And these extra rules I've added are all of the form S, K, K, like the set, prop, prop rule I added, where S is in one of the PTS's and K is in the other one of the PTS's. The intuition is that these rules, if we add them for every sort in P and every sort in Q, create the pure type system which is the Q logic of P terms. P was the PTS that allows us to form terms and sets. And Q, MinLog, was the PTS that expressed minimal logic. And so by adding these rules we've formed a logic which talks about P terms and which has propositions from the Q logic. And the second thing we did -- I'm sorry. So the theorem is that this new PTS with these extra rules is normalizing if and only if both PTS's are normalizing, so P and Q are normalizing. And the second thing we did was allow quantification over all propositions. And to do that we added this new special sort that I called multiverse that here I call S triangle K. And then, I added these two rules, okay, which say I can quantify over S for any K but I have to bump up the result. I have to end up in this new sort. And this allows me to keep quantifying over S, to quantify several times over S. And I call the resulting PTS P hat. And the intuition is that you can quantify over S-parameterized K's using these rules. And my result, again, is that if I add all these extra rules I'm normalizing if and only if I was originally normalizing. So these are two methods to combine pure type systems while preserving normalization. Okay, and the proof: I'm not going into it but it involves identifying unique sorts associated with each redex. And the sort is going to classify the redex, and we need to sort of consider each redex individually. We can erase all the redexes that come from one PTS and just examine the term cleaned of all these redexes. And on the other side we can just examine each individual redex from the other PTS. And then, we need a commutation result which says, well some reductions can be done in one PTS and they won't interfere with any of the redexes done in the other PTS. So this is a very combinatorial proof and it uses ideas from Bernardy and Lasson who have this similar goal of trying to enrich PTS's so that they can express more things. Okay, wow. Yeah, I'm pretty early. I guess nobody is going to complain about that. So pure type systems can be used to ask and to answer questions about quantification. The main question we asked was what kinds of quantifications can we add to a logic without risking inconsistency or without making it more powerful in the sense that more things are provable, only more powerful in the sense that more things are expressible. My second observation, which I feel is kind of the real message I want to get across, is that it's interesting to study normalization-preserving extensions. Often we take a PTS and we try to prove it's normalizing sort of independently of everybody else. We sort of put it in a black hole and we study just that PTS. But what I'm saying is that we should study combinations and interactions between PTS's. And this is an interesting field of study because it allows us to build richer type systems. We were just talking about F star yesterday. 
And F star can be seen as a very rich type system which is made of these different components that we put together in a larger system which is very expressive. Okay, so I've shown that certain rules can be added safely. I'd like to be able to say, okay, using these proof techniques, these are all the possible rules which can be added. I think there are more rules that are safe, that can be proven safe, but I haven't characterized exactly which rules can be added to a PTS. And the question here is can we take systems that are sort of built like this, that sort of have this term component and this logical component, and can we simplify, for example, consistency proofs using this approach. Often you want to show normalization of the propositional side but you have these terms that come in and so you're scared they interact in some manner and you have to show that they don't. I'm wondering if this result is general enough to be able to say, okay, once and for all this is what you need to show to be sure that there's no interaction between the logic and the programming side of your type system. Okay, so of course an obvious extension is what happens when we add inductive types? What happens when we add existential types? What happens when we make our type system more complicated? At what point do we lose these nice results? And finally this is more speculative: it would be nice if we had a proof that was less combinatorial. Often there are proofs of conservativity, there's a combinatorial version which is very hands-on and then there's a semantic version which is a little bit more abstract but often much more succinct and more powerful. So I'd like to be able to understand a model theoretic view of these conservativity results. That's all I had to say about this, so thank you very much. These are references if you want to see them, and I'd be happy to answer any questions. [Applause] >>: Have you taken your theorems through some proof assistant? >> Cody Roux: No. There are parts of the theorem which are reasonably straightforward, which are just rewrite theory, but there is a part that is quite technical. In particular this disjoint sum separation, it's quite technical. So it would be a worthwhile effort to prove this theorem formally. I'm afraid I don't have much experience with proving formal statements about programming languages in Coq. But, yeah, it would be a worthwhile effort. And it's not sort of unfeasible because there has been a lot of work on pure type systems already. >>: [Inaudible] but there are other systems [inaudible]. >> Cody Roux: Yeah, there are other systems but pure type systems have been studied in Coq. I'm sure of that. I don't know if they've been studied in other things. Probably Agda, I don't know. [Inaudible] maybe not because they're not as interested in dependent types but maybe there are things in [inaudible]. >>: Quick question: so have you seen some people who are working on type systems that -- So the system for example called Trellys which chooses to include star colon star, letting parts of the system be inconsistent but tries to isolate within that system a core that remains consistent. Have you thought about how you might try to do that in your setting? Like have some part of the system that is [inaudible] crazy, but you have some theorem about some subset of your type system? >> Cody Roux: Yeah. So Trellys is interesting because it was part of the motivation for this result, because Trellys has some non-termination, for example. But that's the programming part. 
And then, you want this logical part that's terminating and you want to say, oh well there's this separation. I mean, you want to have this system and you only have Q that's normalizing. Okay? But we can show that if Q is normalizing then in this system, if you have something that comes from Q, it has a normal form. And so that's one of the motivations I had. You want to show this in Trellys and it's hard. But then there's an additional thing where they have this inconsistent type system but they say, "Okay, if after the fact you show that you haven't actually used this rule then you're okay." And I haven't really thought about that but in general it's just for convenience in some sense. I think if they could add universes in a simple extensible way they would've added universes. And they just say, "Okay, we put type, type and it's convenient. But someday we'll add universes and we won't need type, type anymore." >>: I'm not sure I think of it only as a convenience. I mean it's also an expressiveness thing. You get to write non-terminating programs because you may want to. >> Cody Roux: Yeah, but adding type, type... >>: [Inaudible] same properties about those things in a consistent way. >> Cody Roux: I agree. And you want the non-termination to be in your term language and not in your proof language. And, yes, this approach sort of has the ambition to be able to analyze these types of systems. But I think type, type is sort of beside the point. When you add non-termination you want to add a fixed point operator. You don't want to add type, type because it's an open question whether you can actually write a fixed point combinator in type, type. You can have non-terminating terms but you don't have a fixed point combinator. So type, type is kind of an accident. It's just a convenience. It dispenses you from having to think about universes. It makes everything simple. It just makes things inconsistent which is kind of an inconvenience. But, yeah, you want non-termination. You want fixed points. And here you definitely want termination because it's a logic. And this work is definitely aimed at saying things when P is non-terminating and when Q is terminating. I haven't talked about this because it's a little bit less pretty to express. But you can have a theorem to that effect. >>: You do have such a theorem? >> Cody Roux: What? >>: You do have such a theorem like the case where P is... >> Cody Roux: Yeah. But you can't just say that for any P the combined system is normalizing. You'd have to say for all terms that are of a sort which comes from Q, that term has a normal form. So it's a little bit less elegant. >>: Okay. >> Cody Roux: Well, thank you. [Applause]