
>> Nikhil Swamy: Okay, well thank you for coming this morning. It's my pleasure to have Jonathon Protzenko visiting us for the next couple of days while interviewing at the [inaudible] group. Jonathon is going to be telling us about some work he's been doing on Mezzo, an ML-like programming language with a type system that's inspired by separation logic. And what's also cool is that Jonathon is a bit of a doppelganger. He's got a second life as a web developer doing open source development on various Mozilla projects. And if I understand correctly, he maintains one of the extensions for Thunderbird that's got more than a hundred thousand users or so. So, that's pretty awesome. Looking forward to hearing about both these things; maybe more about Mezzo than about the other. Welcome. >> Jonathon Protzenko: Thanks, Nikhil, for the introduction. As Nikhil said, the talk is going to be about my research. It's mostly going to revolve around Mezzo, the language that I designed during my PhD. There are going to be three main axes to this presentation: first of all I'm going to tell you about the design of it and then how we implemented it. And then, I'm going to tell you about the user interaction. And that's where the visualizations and all the fancy [inaudible] are going to come in. So quickly a word about myself: I flew in from INRIA back in France and I'm in the Gallium team which you certainly know for some of the famous projects, OCaml or CompCert. My advisor is Francois Pottier and as Nikhil said I happen to be a Mozilla contributor. My PhD was really Mezzo the language. It's been all about designing it. And what we've been trying to do is provide a language that's in the tradition of ML but that goes further using the type system. It goes further. It provides more guarantees. It's about three buzzwords, if I may say so, which are state, ownership and aliasing.
So hopefully it will be clearer by the end of the talk what I mean by this. Here are the key points that I'm going to cover, among other things. First of all I'm going to tell you how we designed the core of Mezzo, how we combined various ingredients from the literature to create what we believe is an interesting core calculus. Then, we'll come to the problem of complex aliasing. You have an arbitrary graph of objects, pointers in every possible direction. What is our strategy for dealing with that? I'm going to tell you a little bit about the implementation since that's one of the major fragments of my PhD. Why was it hard implementing Mezzo and how is it done? And then, I'm going to say a word about visualizations. So really the talk is going to be sprinkled with demos and visualizations, and hopefully that will make it more interactive. Really, if there is anything that's unclear, I really do wish to be interrupted to make sure that no one is lost. I think it's going to be better for everyone if you ask questions in the middle of it. So I like to start this talk with a small example of what I think is a bad programming language: that's the control equations for Apollo 11, the lunar landing. The comments are really interesting. It reads: Temporary, I hope hope hope. And they landed people on the moon with that. So there is... >>: Where did you get that? >> Jonathon Protzenko: It's actually on Google Code. There has been a release of all the old documents, so you can actually read the original control programs. [Inaudible audience comments and laughter] Yeah, so back then there were some pretty bad programming languages but... >>: But there is another way of looking at it: it means that clarification is not needed because we managed to land on the moon without it. >> Jonathon Protzenko: Right. >>: With such a terrible programming language. >> Jonathon Protzenko: [Laughing] I wouldn't want to fly to the moon with that program.
So people quickly realized that we needed better programming languages. A guy called Knuth said, "I think we're on the verge of discovering at last what programming languages should really be like.... By 1984 consensus for a really good programming language." Right. So it turns out that's not quite the story. That's an example that I kind of like. There's a variable which is equal to itself and also equal to not itself. What I won't tell you is I don't want to bash any language. What I want to say is the quest for better programming languages goes on. It is still an open problem. And why do we want better programming languages? We want better programming languages to better reason about the programs. If you're landing a guy on the moon, you want to have maximal confidence that the program does what it intends to do. And one way to attain that goal is to have better programming languages that, from the start, rule out more bugs. And that's what I've been trying to do. The way I've been trying to do it is through type system. Type system is [inaudible] confidence for my team and really we wanted to go to the next step. There is the famous Milner motto: well-typed programs do not go wrong. We really want to take it to the next level. And the way we did it through a type system there is, in our opinion, a lot of advantages to doing it through a type system. It's part of the workflow. Every time you program you've got to think about your types. If you want to port a program to a better type system, you can do it incrementally. You don't have to rewrite the code; you can just improve on the types. It's one layer. You don't have a verification layer on the one hand and your program on the other hand. And also what we are really trying to achieve is compromise. We don't want full specification; we just want a better type system that gives you more guarantees. 
And really the end hope is that programs that are written in Mezzo, because they've been written using that fancy type system are more structured meaning they're more likely to be proved if someone wishes to do so. So Mezzo feels like ML, really the syntax looks like ML. It's very familiar but type system is close, I would say, to separation logic. It allows you to talk about state: is my socket initialize already or closed? Ownership: I have that piece of immutable data, which thread owns it. And aliasing: I have many, many pointers. How do I manage that? And, again, I really wish to stress that Mezzo is not the ultimate answer. It is not going to provide you with a complete specification of your program, just stronger guarantees while hopefully remaining somehow manageable. The reason we're doing that, and I really should clarify how we stand with regards to it, is concurrency. Really the end goal is to have a better type system for helping you write concurrent programs. The initial effort was aimed at sequential programs because there are a lot of things that you have to sort out before you move on to talking about concurrency. Working on sequential programs has allowed us to incorporate new idioms to understand more stuff. It allows us to talk about effects and then, it allows us to gain that guarantee. One of the main theorems that we have is that programs in Mezzo are datarace free. So what we have now is a lot of modules that are formalized in Mezzo for locks, concurrency, threads, channels; however, the main caveat is that we do not have a concurrent runtime. I just want to be careful about that. We're using OCaml's model which is not parallel. So the future may be to move to a better runtime. We set the mechanisms for talking about it but we don't have parallel execution in Mezzo as of now. Before jumping in right to the examples I just want to tell you about a few sources of inspiration that we had so that hopefully... 
>>: Can I just ask a question about the previous slide? >> Jonathon Protzenko: Yeah, sure. >>: So you're saying that you support threads but they can run only on a single core. >> Jonathon Protzenko: Yes. >>: So you will make sure that the operating system doesn't [inaudible]? >> Jonathon Protzenko: No, what I mean is that we have signatures for the spawn function in our type system. We type check the spawn function with a fine-grained type that tells you what the thread modifies in the heap, etcetera. But ultimately there is only one thread that's running in the runtime system of OCaml. We do not have multiple threads; it's just sequential. Programs are compiled sequentially. Actually I think the thread module does not even compile. We cannot use it. I mean there is a type-checker interface for it. It's formalized, it's modeled, but we do not have any support for running threads right now. >>: [Inaudible]. >>: What? [Laughter] >>: No, I don't understand. >>: So I guess the question was, are you – you said not truly parallel but you could be supporting switching between threads on a single processor. >> Jonathon Protzenko: Yes, correct. We do not even do that. >>: You don't even do that. >> Jonathon Protzenko: Our effort was really in the design of a language not in the implementation of a runtime system. >>: So if I have a blocking primitive, like I don't... >>: Would that block a program? >>: Would that [inaudible]? >> Jonathon Protzenko: It would be just like – What kind of blocking primitive? >>: Like suppose I acquire a lock. >>: A lock. >> Jonathon Protzenko: I don't think locks are implemented right now. I mean, they have signatures. You can write a program that has locks... >>: Okay, so there's no... >> Jonathon Protzenko: It's very primitive. I mean we have a compiler just to say that we can run basic programs that do print. >>: Yeah, I get the drift now. >> Jonathon Protzenko: Okay. It really is about the design of a type system not – Okay.
>>: So there's nothing deep to extend. You could in principle with some engineering have a runtime that does this stuff? >> Jonathon Protzenko: Absolutely. I mean it's just some missing manpower. We just need to fill in the gaps in the compilation tool chain. >>: [Inaudible] will be fun with this. >>: So you use the OCaml runtime in Mezzo. >> Jonathon Protzenko: Yeah, the way we do it is that the compiler translates Mezzo code to OCaml code. >>: But Mezzo code has nothing to do with OCaml libraries? >> Jonathon Protzenko: No. I mean, you can interact between the two. You can call Mezzo code from OCaml but, again, our focus has been on the design of the language and the proof of soundness. Right now the compiler outputs an OCaml file with several calls to [inaudible] because the type system of Mezzo is more powerful. And then, you just compile it and run it. We have some very basic sample programs. Any other questions? Okay, so let me just give you a few sources of inspiration to make sure that we can locate Mezzo in the design space. Mezzo is influenced by a lot of existing work. Of course there is the linear lambda calculus which has that idea that some variables may only be used once and that way you get predictable de-allocation and that's an efficient way to model system resources. So we do have that notion also in Mezzo that some variables can be uniquely used or have a unique owner. Another source of inspiration is alias types. In my opinion the main idea of alias types is that you want to keep track of aliasing relationships in the type system. You want to remember that this field is this object and these other fields point to the same thing and that's included in the type system. And we have a similar mechanism in Mezzo. Separation logic is probably our main source of inspiration.
The fact that we have a store that can possibly separate distinct fragments of the heap, the fact that you can have local reasoning in a function, and you know that whenever it touches its own fragment of the heap it will not interfere with whatever lies everywhere else. The frame rule. These are things that appear in some way in Mezzo. Another source of inspiration has been object-oriented languages; there is the Plaid family of languages where people have a language that feels like Java that has a type system. And on top of that they have a reasoning about permissions, and they want to model object protocols like iterator, has next, ready or close. So they have state diagrams and they want to model that. And we also have something similar in Mezzo. And finally a paper that I quite like is the one about ownership types. There are a lot of great ideas in the paper: the fact that there is an ownership hierarchy, that objects may own others. And also an essential idea that we have is that you may have a pointer to an object but you may not have ownership of it. That's something I will elaborate on in this talk. That's pretty much all for the intro about the language. I'm going to directly jump into some demos. There are going to be several self-contained examples. I'm going to switch back and forth between Mezzo and this presentation, and I'm going to do some quick recaps. If anything, the syntax or what I say, isn't clear, please do interrupt me. I'm going to start with a usage protocol which I call the write-once references. Say you're parsing your command line options. You have an uninitialized piece of memory. You're going to initialize it once and never after again are you going to modify it. And say you want to encode that in the type system. You start with a new write-once reference which is in the state Writable. Then you seal it. So seal has two effects: it writes something to the reference and then it makes it immutable so you can no longer write into it.
And then, once in the frozen state you can call get repeatedly which keeps you in the frozen state. So here is some code that does that in Mezzo. It constructs a value which is unnamed and which is a pair of integers by allocating a new write-once reference, sealing it and calling get twice. So how do we reason, type system-wise, about that example? Type checking in Mezzo is flow sensitive, so we start here and we say that we have the [inaudible] permission. A permission in Mezzo is something that has this form with an "at" sign. For instance if you call new you get a new permission for r which tells you that r is type writable. So permission conveys two types of information: it tells you about the shape of the object memory, r is a writable memory cell which has a [inaudible] for instance. And it also tells you something very important; it tells you that because r is writable, you have ownership of it. You own the fraction of memory that is pointed to by r. How you own it is going to depend on the nature of the type. I will get back to that later, but if you have that permission you have a certain ownership for r. What happens is that you call seal. We'll see in a minute the signature of the seal function, but it expresses the fact that it needs to have this as a writable and it transforms it into something that's frozen. The signature of seal expresses that. Actually the permission disappears and another one appears. So really permissions, they come and go and they may change through execution. Permissions there are completely in the type system. They do not exist at runtime. It is our type system. We don't have a separate judgment with [inaudible]; this is what we use for our type system. Is that clear so far? Okay. I think I have a demo. Okay, so can everyone see? Let me click on writeonce references. I am going to clear this. That's Mezzo in a web browser. I've compiled Mezzo to Javascript and it type checks in your browser. 
I'm not connected to the Internet so it's not sending anything off to some other server to type check. So it may be a little bit slow because it's Javascript, but it's still fairly acceptable. That's the implementation of write-once references. Basically we're – Yeah? >>: Can you use the pointer on your PC rather than [inaudible]? >> Jonathon Protzenko: Sure. >>: Thanks. >> Jonathon Protzenko: So that is a definition of a data type. That's the writable data type. It only has one field for contents which has type unit, so it's just a placeholder value. It's mutable because there is the keyword and then, the frozen type also has a contents field except the contents field has type a. And because there is no mutable keyword, it is immutable. The new function, well it just returns to you a new writable with a placeholder value in it. And the interesting bit here is the seal function. The seal function tells you that it takes ownership from its caller of a thing called r which is writable. It also takes ownership of x. So that's the special [inaudible] keyword here. It tells you that you're stealing from your caller the two permissions and in exchange you're returning a new one to your caller. You're returning to your caller a permission for that variable r and you're returning to it the fact that it's now frozen. The function operationally does two things: it writes x into the contents field and it writes into the tag of r the frozen tag which basically amounts to telling the type system that the thing is now frozen. And finally the get function, it just takes a reference r and it returns a pointer to the contents field. >>: Sorry, what's the pipe on line 13? >> Jonathon Protzenko: Yes, so the pipe is to separate – it's the conjunction of a type and a permission. So if you write t pipe p, it's a value that has type t along with a permission p. So if you have the unit value along with permission p, that's written like this and we somehow abbreviate it as this.
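The write-once protocol described above (new, then seal, then any number of gets) can be approximated in other languages with ownership tracking. The following is a minimal sketch in Rust, not Mezzo code: the names `Writable`, `Frozen`, `seal` and `get` follow the talk, but the encoding is an assumption made for illustration. In particular, Mezzo changes the type of the same memory block in place, whereas this sketch consumes one value and returns a different one.

```rust
// Rust analogy (not Mezzo): the two states are two distinct types, and
// `seal` takes `self` by value, so the writable handle is consumed.

/// A write-once reference that has not been initialized yet.
pub struct Writable;

/// A sealed reference: its contents can be read but never written again.
pub struct Frozen<T> {
    contents: T,
}

impl Writable {
    /// Plays the role of `new`: a fresh reference in the Writable state.
    pub fn new() -> Writable {
        Writable
    }

    /// Plays the role of `seal`: takes ownership of the writable handle and
    /// of the value `x`, and gives back the reference in the Frozen state.
    pub fn seal<T>(self, x: T) -> Frozen<T> {
        Frozen { contents: x }
    }
}

impl<T> Frozen<T> {
    /// Plays the role of `get`: may be called repeatedly; the reference
    /// stays Frozen because only a shared borrow is needed.
    pub fn get(&self) -> &T {
        &self.contents
    }
}

fn main() {
    let r = Writable::new();
    let r = r.seal(3);
    // Calling `seal` a second time is impossible: `Frozen` has no such method.
    let pair = (*r.get(), *r.get());
    println!("{:?}", pair); // prints "(3, 3)"
}
```

As in the talk, the state of the object is simply the type it currently has, and no state information exists at run time.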
So it's the unit value along with conceptually a permission. Any other questions about the syntax? Yes? >>: [Inaudible] you take the ownership. So if you do get, [inaudible] it seems like nobody else can get any more. >> Jonathon Protzenko: That is an excellent question which is going to be addressed in ref get fixed too. I'm going to get to that. We did not coordinate with Dan so that he would ask me the question. Yes? >>: Two questions: so in your previous slide you had writable and frozen and here you have writable and frozen where they're declared as types but they're not built into the language. >> Jonathon Protzenko: Yes. >>: So I guess that they were not [inaudible] refer to these types. >> Jonathon Protzenko: Yes, absolutely. We... >>: But what's built in is the consumes? >> Jonathon Protzenko: Yes, consumes is a built-in concept. >>: Now in the seal function you have r.contents and when you're passing r as a writable that has a [inaudible] in the [inaudible] field. >> Jonathon Protzenko: Yes. So let me show you... >>: Can you override it with an alpha? >> Jonathon Protzenko: Yeah, so let me show you... >>: So this wouldn't [inaudible] in the simple types, right? >> Jonathon Protzenko: Absolutely. There is actually a state change that takes place. The variable goes through different types. Initially we have r at writable. But the type system is aware of the fact that writable has only one branch, so it expands that as, what we call, a concrete type. It's basically a type that tells you about the structure of r in memory. That's a more precise type that tells you that there is a writable tag and then there is a contents field which has types in it. And the system knows how to deal with that in the sense that if you assign something into the contents field, it's going to keep track of that change by saying that the contents field has type a. So we are in an intermediary state. We have some [inaudible]. 
That type cannot be seen as a writable and cannot be seen as a frozen either. It's something in between. The type system keeps track of that. It allows you to temporarily break invariants. And then on the next step, you change the tag. So you just modify the writable portion of it and you write frozen instead. And then, the type-checking says the post-condition of that function says that I have to have r as a frozen thing. Is it actually frozen? Yes because this is a subtype of r at frozen a. So subtyping takes place and then you get what the post-condition of the function requires. >>: And the data declaration declares records that have tags. >> Jonathon Protzenko: Yes. >>: So the tag keyword acts as the tag of the record? >> Jonathon Protzenko: Tag of is a keyword, yes. And when we say mutable here, it means that everything, the whole memory block is mutable including... >>: With the tag [inaudible]. >> Jonathon Protzenko: Yes, absolutely. >>: Okay, so then frozen is not mutable? >> Jonathon Protzenko: So if I were to do another assignment here like tag of r gets mutable, that would be an error because the thing has been frozen. It no longer is mutable. >>: So the type system can synthesize these intermediary types that aren't expressed anywhere else in the program? [Inaudible]... >> Jonathon Protzenko: Oh, you can write that type. You can actually write it. I could've said here instead of writable I could've said r writable contents unit. That's pretty much the same thing. >>: Yes, but when you assign something to the contents and now you have a writable with contents that's not unit, that's not described anywhere else in the program. It just synthesizes these on the [inaudible]... >> Jonathon Protzenko: Yeah, it is aware of that. It is aware of that. And you could actually write it here. Say, I wanted to give it a somehow different signature, let's say I'm going to do something stupid.
I want to only initialize writables that originally contain integers. You can write that intermediary type. Okay. So quick recap. Permissions: they're available at each program point. They change as you step through the program. So type-checking is flow sensitive. The functions have in their signatures the effects that they perform, and the permissions, they are our type system. And here we've been talking about state so for us the state of an object, it's simply the type that it has. It may move from type a to type b through a function call. Okay, I'm going to show you another example which is that of a race that is rejected by the type system of Mezzo. Whoops, not this. This. I have a variable r which is a regular reference, so I'm just calling the built-in new ref. New ref has an implementation that just defines ref contents, so it's just regular references. And there is an incr function that mentions in its signature that it modifies r. Because there is no consumes keyword, it takes ownership of r and it also returns it. That's the syntactic convention. If you do not have the consumes keyword, you're just modifying your permissions. Incr does not take a pointer to r; it just modifies it. It closes over the pointer. It basically assigns r plus 1 into r. So let's say that I want to start two threads that only do incr. What do I have in terms of permission at that program point? I have r which is a reference, and I also have a permission for incr which is a big function type. I want to call spawn. Because incr requires the permission for r, after I've called spawn for the first time I no longer have the permission for r. So if I try to call spawn for incr another time that is going to fail because the required permission is no longer there. So let's try to type check this. I hit go. It tells me that at the highlighted location, I cannot obtain the permission for r as a reference to an integer.
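The rejected race can be illustrated with a rough Rust analogy, in which moving a uniquely owned value into a thread plays the role of handing the permission for r to spawn. This is a sketch under assumptions, not the Mezzo demo: `spawn_incr` is a hypothetical helper, and handing the counter back through `join` is an addition made here so the result can be observed.

```rust
use std::thread;

/// Stand-in for `incr` from the talk: it needs exclusive access to the
/// counter in order to modify it.
fn incr(r: &mut i32) {
    *r += 1;
}

/// Hypothetical helper: moves the uniquely owned counter into one thread,
/// increments it there, and hands it back when the thread finishes.
fn spawn_incr(mut r: Box<i32>) -> Box<i32> {
    let first = thread::spawn(move || {
        incr(&mut r);
        r
    });
    // A second spawn that also captures `r` is rejected at compile time,
    // because ownership of `r` was already moved into the first thread:
    //
    //     let second = thread::spawn(move || incr(&mut r)); // use of moved value
    first.join().expect("worker thread panicked")
}

fn main() {
    let r = spawn_incr(Box::new(0));
    println!("{}", r); // prints "1"
}
```

The error arises for the same reason as in the talk: after the first spawn, the caller no longer holds what the second call requires.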
It tells me that there is only another lesser permission available for it which I will mention in this example. Okay. So how does that work? The system has a built-in notion of what is shareable and what is not shareable. For things that are mutable, it knows that they must have a unique owner because, say, if you modify a thing and you put it into another state, if someone else still owns a copy of the permission that points to the older state, you're going to have an inconsistency. So anything that's mutable must have a unique owner. And the system is aware of that. >>: Is it possible to see the signature of spawn? >> Jonathon Protzenko: Yes, I suppose. Let's look up in the standard library. Not there. Let's look up in the core library. Yeah, here's the signature of spawn. It basically takes a function f that requires a certain permission p. The outer spawn function also takes, itself, the permission p and it runs f by conceptually passing it the permission p. You're basically expressing the transfer of ownership. Spawn takes from its caller p, transfers it to f and starts f on a new thread. >>: And incr actually – it doesn't quite consume a permission, right? It takes it and gives it back in some sense. >> Jonathon Protzenko: Correct. >>: And you have some kind of sub-typing [inaudible]... >> Jonathon Protzenko: Yes, I'm pretty sure that there's sub-typing going on where we're dropping the permission. The syntactic sugar without consumes translates to a permission that's consumed in your argument and returned, and you can just drop it in the codomain of the function. Okay. Yes, so there is a built-in notion of what is [inaudible] and what is not. Data may be frozen from mutable to immutable but of course not the other way around for that would be unsound. Let me just now move on to another example which is an aliasing violation. By the way there is a way to fix the race which is to use a lock, and I'm just going to run over that example quickly.
But this is a higher order function that transforms a function f that has some internal state s into a function that no longer exposes its internal state by allocating a new lock. So what the function hide does is that it allocates a new lock for the function's internal state, and it creates a new function that has no internal state. First thing, it acquires the lock, calls the original function and then releases the lock. So it's hiding the internal state of a function in a higher order way. And if you call hide on incr, you're basically getting a function from unit to unit and that function you can spawn it twice. If I clear this, I think I can type check that example. Okay, so a bit – Yes, it type checks successfully in about three seconds. Okay. >>: For me that's strange because I don't think you really hide the internal state because it's observable. What you really mean is you kind of the permissions of that... >> Jonathon Protzenko: Yes. >>: ...such that you can use it in [inaudible] way. >> Jonathon Protzenko: Yes. >>: You're still guaranteed data-race freedom. >> Jonathon Protzenko: Yes, absolutely. That's what the fact that this program type checks means. Thanks for rephrasing it in a better way than I did. So that example should make you happy. That is the usual get function for references. If I try to type check that example, it's going to tell me that I have an aliasing violation. Why? Because I'm pretending that I'm maintaining the argument r as a valid reference but I'm also returning a pointer into the reference, so now I'm aliasing the contents field of the reference. I have one way to access it through r and also through the return value that's returned by the function. So that's an aliasing violation. How can I fix that? The first natural way to fix it is to require that the reference contain duplicable elements which I can then freely alias. And then, this example becomes legal. But I can do something smarter.
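The hide combinator described above, which wraps a function's internal state behind a freshly allocated lock so that the resulting function can be spawned twice, can be sketched in Rust as follows. This is an illustrative analogy and not the Mezzo library code: the signature of `hide`, the extra return value `R` (added so the result can be observed), and the helper `run_twice` are all assumptions.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical analogue of `hide`: takes the state `S` and a function that
/// needs exclusive access to it, and returns a function that asks nothing of
/// its caller. The state lives behind a lock allocated here.
fn hide<S, R, F>(state: S, f: F) -> impl Fn() -> R + Send + Sync + Clone + 'static
where
    S: Send + 'static,
    R: 'static,
    F: Fn(&mut S) -> R + Send + Sync + Clone + 'static,
{
    let lock = Arc::new(Mutex::new(state));
    move || {
        // Acquire the lock, call the original function, release on return.
        let mut guard = lock.lock().expect("lock poisoned");
        f(&mut *guard)
    }
}

/// Spawns the hidden increment on two threads, waits for both, then calls
/// it once more on the current thread and returns the counter value.
fn run_twice() -> i32 {
    let hidden = hide(0, |r: &mut i32| {
        *r += 1;
        *r
    });
    let workers: Vec<_> = (0..2)
        .map(|_| {
            let f = hidden.clone();
            thread::spawn(move || {
                f();
            })
        })
        .collect();
    for w in workers {
        w.join().expect("worker thread panicked");
    }
    hidden()
}

fn main() {
    println!("{}", run_twice()); // prints "3"
}
```

As the audience member points out, the state is still observable through its effects; what is hidden is the requirement that the caller own it exclusively.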
What I can do is use a special quantification. What I'm going to say is I'm going to return a pointer to the element x which happens to be whatever is in the contents field of the reference. So what's that bizarre thing equals x? That's a type which we call a singleton type. The singleton type equals x are the type of elements that are equal to x. So that's the only way we get dependent types in Mezzo. And what that tells you is that you just return a pointer to the element called x without any ownership information. So you are not attempting to duplicate ownership; you are just returning a pointer into it. The square brackets stand for universal quantification over x. So I'm going to run that example in the – Yeah? >>: The term is another type? >> Jonathon Protzenko: It's a kind. It's the kind of variable. So that is a type variable at kind term. So program variables: if you do let x, you can use x in expressions [inaudible] types with kind term. >>: Could you have written get square brackets a and x colon a? >> Jonathon Protzenko: That would've been embarrassing. No, you cannot do that. No, that wouldn't work because a would have to be a kind here and we do not quantify on kinds. You can write for all a but then a is a type. >>: Right. So you write for all a and then, you also write for all x of type a. >> Jonathon Protzenko: Like this? >>: No, just – I mean... >>: Just a second square bracket after the a; x colon a. >>: I think you said you don't support quantification over [inaudible]. >> Jonathon Protzenko: Yeah, we don't support that. >>: Oh, I was thinking that this is quantification over types. >> Jonathon Protzenko: No, x is already a type variable of kind terms so that really means a of kind type. >>: I see. >> Jonathon Protzenko: And this is x of kind term. >>: X is already a type variable. >> Jonathon Protzenko: It is already a type variable. >>: So you're doing it in singleton type style. >> Jonathon Protzenko: Yes. >>: Okay. 
>> Jonathon Protzenko: Yes. >>: Okay. >> Jonathon Protzenko: It is a dependent type. I'm just going to run that example in a terminal. What appears in the terminal is not important. What is important is this. What I want to tell you is that the type-checker because of these singleton types keeps track of aliasing information. So let's move back to that example for a minute. What does the example do? It allocates x which is a tuple. It allocates a new reference whose contents field is x. And it calls get r and stores the result in y. I'm just going to put the two side by side. The type system knows that x and y are the same things. It knows that the contents field of r points to x which also happens to be y, and it also knows that the tuple here has two components that are integers. So internally the type-checker has a graph of objects, and that is done through the mechanism of singleton types. Why do we use singleton types? Yes? >>: So the problem: suppose that x has a reference to [inaudible] just 2 as a binder to 2. >> Jonathon Protzenko: I'm sorry? >>: Suppose that – So here you declared x or defined x as a tuple. [Inaudible]. >> Jonathon Protzenko: Yeah. >>: If you define x as a tuple of ref 1 or ref 2. >> Jonathon Protzenko: Yeah. >>: So then the state of x would get [inaudible] tuple. >> Jonathon Protzenko: Yes. >>: Your get function would still work because x and y are equal in your type system. >> Jonathon Protzenko: Yes. >>: But so does your type system then let you write programs without deep copying your values or would you have to deep copy – When you started the examples I wasn't expecting a get function that would perform a deep copy. >> Jonathon Protzenko: Okay. >>: But it seems not to be [inaudible].
>> Jonathon Protzenko: No, what happens really is that when you create an alias, like let x equals y, instead of deciding whether x or y gets information, whether we have x at interesting stuff or y at interesting stuff, we just remember in the type system that x and y are the same thing. And that's where the singleton types kick in. We have a special permission x at equals y. And we just add that. We do not try to do a deep copy to make sure that both x and y are references. We do not try to assign ownership to the other one. We do not do borrowing, like x borrows the reference from y or something else; we just put that equation on the side and the whole type-checker is able to deal with the equations. It can perform rewriting and use one or the other indifferently. And using singleton types we keep track of the aliasing relationship between program variables, tuple fields and record fields. >>: Does this work with control flows if you have... >> Jonathon Protzenko: That is an excellent question. The short answer is we thought it would be a problem and it basically never happened in practice. Whenever control flow diverges – I'm going to get to that later – we do a graph traversal in parallel and we reconstruct maximum [inaudible] information. And we don't have problems in practice. I mean, we're able to reconstruct the aliasing information. So the conjunction of permissions is a graphical description of the heap and that's the way we think about it. I'm going to switch to a complete example. I am going to do the example of list concatenation. Let me just switch to my browser again. That is the definition of lists. It's either Nil or Cons. These are immutable lists. We have some helpers for constructing Cons and Nil, and we're doing append. So append, it's just like in ML. You match on xs. If it's Nil, it's ys; otherwise, you just create a new cell until ultimately you make it point to ys.
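The recursive append just described (match on xs; if it is Nil, the result is ys; otherwise build a new cell) can be written in Rust as a sketch. This is an analogy and not the Mezzo source: the `List` type and the helpers `from_vec` and `to_vec` are assumptions made here. Taking both lists by value is roughly how Rust expresses that the function takes ownership of its arguments from the caller.

```rust
/// A singly linked list: either Nil or a Cons cell, as in the talk.
#[derive(Debug, PartialEq)]
enum List<T> {
    Nil,
    Cons(T, Box<List<T>>),
}

/// Recursive append. Both arguments are passed by value, so the caller
/// gives up ownership of `xs` and `ys` and cannot use them afterwards.
fn append<T>(xs: List<T>, ys: List<T>) -> List<T> {
    match xs {
        List::Nil => ys,
        List::Cons(head, tail) => List::Cons(head, Box::new(append(*tail, ys))),
    }
}

/// Hypothetical helper: builds a list from a vector, preserving order.
fn from_vec<T>(v: Vec<T>) -> List<T> {
    v.into_iter()
        .rev()
        .fold(List::Nil, |acc, x| List::Cons(x, Box::new(acc)))
}

/// Hypothetical helper: flattens a list back into a vector.
fn to_vec<T>(mut xs: List<T>) -> Vec<T> {
    let mut out = Vec::new();
    while let List::Cons(head, tail) = xs {
        out.push(head);
        xs = *tail;
    }
    out
}

fn main() {
    let l1 = from_vec(vec![1, 2]);
    let l2 = from_vec(vec![3, 4]);
    let l3 = append(l1, l2);
    // Using `l1` or `l2` here would be a compile-time error: both were moved.
    println!("{:?}", to_vec(l3)); // prints "[1, 2, 3, 4]"
}
```

This version is not tail-recursive: each call waits for the recursive result before it can build its cell.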
The interesting bit and the reason why I'm showing that example is that there is a consumes keyword. Because you have to take ownership from your caller of the two lists; otherwise, you would be creating aliases. If your caller can still point to the original list and the concatenation, the elements that are in xs and ys will be aliased. So the only way we can write that function is to ask it to require ownership and to steal ownership from the caller. That seems very restrictive and, indeed – Well, when it's a list of references, because references are uniquely owned, the caller loses the ownership of the lists. Right? If you try to assert that L1 is a list of references at this program point, that is not going to work. Let's try it. It tells you that it could not obtain the permission that L1 is a list of references to [inaudible]. So really the type system rules out this dangerous behavior. The good news, however, is that if your lists happen to be lists of integers, lists are immutable, integers are immutable. So L1 points to an immutable fragment of the heap, and the type-checker knows that. It knows that this permission is duplicable so it's safe to save a copy, use the copy for calling append and then keep a copy for yourself. So these assertions are actually going to work. And we call it call-site polymorphism. Yes, type checks successfully. And the big example that I'm going to show you is that I'm going to write a tail-recursive version of append. So we can already write a tail-recursive append in ML by doing two passes: first rev append and then rev. But we can write it in one pass in Mezzo and that's something that you cannot do in OCaml without resorting to unsafe implementation techniques. So how would you do tail-recursive concatenation of lists in one pass? You use three types of cells. The yellow ones are Cons cells. They are immutable but they do not point to the start of a well-formed list. The blue cells: they're mutable. 
They have an uninitialized tail field. And the green cells, they point to well-formed lists. Here's a loop which we model as a recursive function. The loop works as follows: you have created copies of xs which are the yellow cells and then, you're basically trying to stitch dst onto ys by copying cells from xs. So this is a mutable cell. What you do is that you allocate a new blue cell which is the copy of xs that you have to do. You perform pointer rewiring because that is a mutable cell and then you freeze the blue cell. So that's going to be one step. And you move forward as you go. You allocate cells on the fly. You rewire each cell to the next one until you reach ys. And in that case, you're done. You've created copies of xs and then, you've rewired the end of xs onto ys. So you have a well-formed list. The problem is the type-checker does not know that; there is more reasoning needed to prove that this is a well-formed list. You have to say ys is a well-formed list. Dst is an immutable Cons cell that has a tail field that points to a well-formed list. Therefore, it must be the case that dst is also a well-formed list. Nothing happens at runtime; this is pure reasoning. >>: What is well-formed list? >> Jonathon Protzenko: Something that has type list a. That is not a well-formed list because it ends up with a dangling tail field. So, that is not a well-formed list. >>: So where is that type list defined? >> Jonathon Protzenko: I can show it to you. >>: [Inaudible] just the middle column. >> Jonathon Protzenko: Yeah, this is list. >>: Okay. >> Jonathon Protzenko: And the definition of cell is basically the same thing. There is a head and tail field except the tail field is a placeholder and it's mutable. I think you had a... >>: [Inaudible]. >> Jonathon Protzenko: Okay. 
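The one-pass scheme described above can be sketched in Python as a loop that always keeps a pointer, here called `dst`, to the last cell whose tail is still unfilled. This is only an illustration of the runtime behavior: Python has no notion of freezing a cell or of permissions, so nothing here checks the ownership reasoning that Mezzo performs statically. The names `Cell`, `append`, `from_list` and `to_list` are assumptions made for this sketch, and `None` stands in both for Nil and for a tail that has not been written yet.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Cell:
    head: Any
    tail: Optional["Cell"] = None  # None: Nil, or a tail not yet filled in


def append(xs, ys):
    """Concatenate xs and ys in a single pass; xs is copied, ys is shared."""
    if xs is None:
        return ys
    first = Cell(xs.head)      # "blue" cell: its tail is not initialized yet
    dst = first
    xs = xs.tail
    while xs is not None:
        new = Cell(xs.head)    # allocate the next copy on the fly
        dst.tail = new         # rewire dst; from now on dst is never written again
        dst = new
        xs = xs.tail
    dst.tail = ys              # stitch the last copied cell onto ys
    return first


def from_list(items):
    """Build a linked list of cells from a Python list."""
    result = None
    for item in reversed(items):
        result = Cell(item, result)
    return result


def to_list(cell):
    """Flatten a linked list of cells into a Python list."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out
```

Each cell is written exactly once after allocation, which corresponds to the point where the talk says the blue cell is frozen. The loop uses constant stack space, so it also works on lists far longer than Python's recursion limit.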
And so, yeah, after that you're done doing operations at runtime but you still need reasoning to unroll your recursion and to finally convince you that once the function is done, dst is the start of a well-formed list. I'm not going to show you step-by-step the entire example. What I want to show you is the type loop here; it's the appendAux that I've been mentioning in the slides. Yes. What happens is there is an additional reasoning that takes place at the type-checker level after the recursive call. Basically we're in a state where dst is our old blue cell, that one that has been made immutable so it has a Cons type. Dst prime is our new blue cell. Tail is our new xs and ys is the thing that we have to rewire our cells on. The recursive call consumes the last three permissions and returns to you, per its post-condition, the fact that dst prime is now a well-formed list. And we know that dst is a Cons cell whose tail field points to dst prime. It's basically the drawing that I made here. Previously we were in that situation and now we are in this situation, and we need to perform the bit of recursive reasoning that tells us that dst is... >>: I'm confused about one thing. So if you go back to the type definitions. >> Jonathon Protzenko: Yes? >>: So you're saying that the Cons cell, [inaudible] Cons cell. >> Jonathon Protzenko: Yeah. >>: You're saying at some point you're allowing the tail to be something other than... >> Jonathon Protzenko: Yes, absolutely. >>: ...[inaudible]? You're breaking in the implementation the list invariant. >> Jonathon Protzenko: Absolutely. That's the question that we had before in the sense that we can represent concrete types that are not subtypes of a nominal type. We can represent... >>: That's why you can't do it in OCaml. >> Jonathon Protzenko: Yes, absolutely. And for that you need to have the reasoning that these things have a unique [inaudible] because you're mutating them into something that's completely illegal. 
>>: Internally within this function I can have some other types and I can wire things up, allowing me to do this mutation as long as in the end I can prove everything is back at the level of [inaudible]. >> Jonathon Protzenko: Precisely. >>: I guess you don't even have to prove that at the end you're back at a list. You could return something in the middle with something [inaudible].... >> Jonathon Protzenko: You can write these types if you want. >>: Suppose the data at type cell has two cases: cell and cell Nil. And now you've changed the tag of dst to Cons. >> Jonathon Protzenko: Yeah. >>: Your type-checker would have to check that you're not taking a cell Nil to Cons because you'd lose --. I mean the [inaudible] fields [inaudible] would not be defined [inaudible]. >> Jonathon Protzenko: Yes, absolutely. So the way it's type checked is that here we have dst at Cons – no, wait, no Cell. Head type a tail equals dst prime. And there it checks that the size matches; that there is exactly two fields and that it's legal to mutate the type. The field names may be different but the size of the block has to match per the runtime constructs. That is something that unfortunately we're exposing because we have runtime limitations we cannot expand in place [inaudible]. >>: So the assertions are [inaudible] appendAux. You say, though, that dst is Cons with dst prime but isn't dst – but then you say that dst prime is a list but it isn't, right? >> Jonathon Protzenko: Where? >>: [Inaudible]. >> Jonathon Protzenko: Dst prime becomes a list. >>: Oh, after the call. >> Jonathon Protzenko: Yeah, after the call. That is the post condition of appendAux. You pass it a thing and it turns the thing into a well-formed list. So you pass it dst prime and it turns dst prime into a well-formed list. And then, there's a last step of reasoning saying that if dst is a Cons cell, it still points to a well-formed list and then, also, it must be the case that it is a well-formed list. 
And the type-checker knows that and performs that automatically. These are just comments. These are not hints to the type-checker. That's just comments to explain what's happening. Okay, so I think I've explained that enough. Append works because we have tail-recursive reasoning, that is, reasoning that takes place after the recursive call. And we have state change and freezing of mutable data. So I'm done explaining to you the base layer of Mezzo. These are the mechanisms that we are using: singleton types, the notion of is it duplicable versus is it non-duplicable and these intermediary states that are not a nominal type. So with that we can express several usage patterns: borrowing an element from a list, in-place link reversal; basically anything that has a tree-shaped aliasing pattern works. And that is restrictive. Somehow you may want to have arbitrary aliasing and that's my next part. I'm going to tell how we deal with the problem... >>: What's tree-shaped aliasing again? >> Jonathon Protzenko: So you have an object that owns other objects that own other objects, and you don't have back pointers. So typically the example that we cannot express is that of a graph. In the graph you have pointers in every possible direction between nodes, and you cannot make it in a way that a node owns its neighbors because there are different pointers and that would be an ownership violation. >>: But there is an intermediate which [inaudible] so you can have some sort of – you can have a tree with an internal ownership... >> Jonathon Protzenko: But that would be an ownership violation because you would have two objects pointing to the same thing, so who owns them? We can represent local aliasing using the singleton types but as soon as it's arbitrary-sized, we can no longer express it with the mechanisms that I have shown you so far. And actually let's do it. Let's try to... >>: So tree means just everything is owned [inaudible] once. >> Jonathon Protzenko: Yes, precisely. 
>>: But in a graph not everything is owned exactly once. Do you mean that the graph is a tree with back edges? >> Jonathon Protzenko: In a graph you cannot express a graph with what I've shown you. The graph definition would not type check. That's my example, actually. >>: You can't even [inaudible]. >> Jonathon Protzenko: You can't even do that, yes. >>: By the way, supposing – is it not possible to just do ordinary programming? If [inaudible] fancy type system and I just want to write a graph algorithm, can I do it? >> Jonathon Protzenko: You have to use the escape mechanism. We have an escape hatch if you will that allows you to break out of the restrictions of the base layer, but you have to be careful. >>: But at the beginning of the talk you said that it compiled down to OCaml. So does that mean that you could write your graph algorithm in OCaml and do that and link it against a program written in Mezzo? >> Jonathon Protzenko: You could. I don't think I've gone beyond the stage of a simple example program. But, yes, in theory you can link them because... >>: But you're not a super set of OCaml, you're not like allowing... >> Jonathon Protzenko: Oh, no. We are different in the sense that we are ruling out the programs that are [inaudible] OCaml because we think they're too dangerous. They're doing in-sync stuff with their aliasing. And conversely we're also allowing behaviors such as the tail-recursive function that are more fine grained. >>: So it's not a subset or a super set? >> Jonathon Protzenko: No. So if you want to interact with OCaml we could technically because Mezzo programs can link with OCaml programs but you'd have to be very, very careful. So what I'm trying to do here is I'm trying to create the graph with one node that points to itself. Basically a node just has a list of neighbors and I'm going to do it in a naive way that is going to fail. I'm going to say that neighbors is a list of nodes and that nodes hold a value. 
And I want to create the node that points to itself. So I'm initializing x as a node. Again, I'm putting a place-holder value in the neighbor field because I don't know what to write in there yet and then, I'm going to create a list of nodes so as to write it in the neighbors field. And that is going to fail. The reason it fails is that you already consume the permission for x as a node in order to create the list. And then, the permission is gone so you no longer have any useful permission for x. You no longer know that x has a neighbors field because that permission has been used already, so you can't do it. It tells you that the only permission you have for x is that permission called x at dynamic. So it has appeared already in the examples, so let me show you what x at dynamic means. That's going to be our escape hatch. I'm changing the definition slightly and adding a seen field because I'm going to do a higher order graph traversal so I need to remember which nodes I've visited. But more importantly I'm changing the definitions of neighbors to be list of dynamic. And now I'm defining a graph to be just a record with a list of roots along with a special directive called adopts node a. And we're going to see on the concrete example how that works out. So here I'm creating g which is a graph that holds integers with one node that points to itself. So, again, I'm creating the node. But what I'm going to do here is that instead of creating a list of nodes, I'm going to create a list of dynamic. Because I've used the permission x at dynamic, I’m still left with the permission x at node so I still have the information that x is a node so I still can write into it. The good thing is x at dynamic, it's duplicable. So even after that line I still have it. I can ask for some information about x. Whoops, sorry. I can call info x, and it's going to tell me that at this program point I still possess the permission x at node. 
I think it – Yes, I still possess the permission x at node and I also still possess the permission x at dynamic. So you also notice that the neighbors field of x is a Cons cell. >>: That's cool. So you extend the programming language with features that let you kind of call back into the type-checker so that it... >> Jonathon Protzenko: Yes. >>: ...gives you some information. >> Jonathon Protzenko: Yes. That's a special hook that allows me to poke into the typechecker, and the type-checker knows about that special function. And it dumps some information about a variable. That's very helpful for debugging. So the important bit is that I still have that permission x at dynamic so I can use it once more to create the list of roots for the graph. And now I'm using a special instruction called give x to g. So what does that mean? It means that I'm relinquishing, I'm giving up my ownership of x in order to transfer it to g. So I'm giving up on x. I no longer own x, but g owns x. How is that type check? I see that g is a graph; that graphs, they adopt nodes. X is a node so it's legal for me to give x to g. And after that... >>: [Inaudible] ownership is about data owning data. >> Jonathon Protzenko: Yes. >>: Usually ownership is about code access, right? Now you're... >> Jonathon Protzenko: Absolutely. Objects are allowed to own other objects. >>: Own other objects. Okay. >> Jonathon Protzenko: Yes. And now it tells me that at this program point the permissions that I have left for x are just x at dynamic. And that's good because I no longer have ownership of x. I have ownership of g and g has ownership of x. And intuitively that allows me to regain a tree-shaped ownership structure. I own g; g owns all the nodes. The nodes do not own themselves. So there are no back pointers. >>: What's info to g? >> Jonathon Protzenko: That's a special directive that instructs type-checker to print out some information. [Multiple audience comments] >>: He wants to see it. 
>>: Can you do info to...? >> Jonathon Protzenko: Oh, sure. It's not going to be very informative. Yeah, sure. It's going to tell you that g is a graph that has a roots which is a list of dynamic and that it adopts nodes of integers. >>: So your constants gone but, otherwise, you have subtypes on the data type as well [inaudible]? >> Jonathon Protzenko: Yes. That is going to subtype to the nominal type, graph a – or graph int, sorry. Yeah, I'm going to move on. So that's the dfs procedure. I'm not going to explain it into the detail. What happens here is the relevant bit. I need to visit node n. N is a node that is dynamic, so I'm taking ownership of n from graph g. I regain – At this program point I'm regaining the x at node permission. I'm doing my stuff with n. I'm calling the higher order function f on it. I'm appending its neighbors into the work list so that they are scheduled for visiting later. And I am also remembering that n has been seen by writing into its seen field. And then once I'm done doing stuff with n, I'm giving up ownership of n again by passing into the graph g. And only after that am I visiting other nodes. So I'm making sure that I give up my ownership before actually visiting other nodes. And of course for that to work, I need to have a runtime mechanism. How do I guarantee that these operations are legal? What happens when take is written, there is a runtime test. It dynamically checks that n is owned by g. And if I try to write take twice, it's going to fail at runtime. Yes? >>: Did you consider doing this at compile time and attaching the permissions of n to the permissions of g? >> Jonathon Protzenko: So conceptually you can think of it as this. You can conceptually think of g as possessing a list of all the elements that it owns. In the implementation we have something smarter using a hidden pointer from n to g. When n is owned by g, it has a hidden field that points to g. 
When it's not owned by anyone, the hidden field is cleared out. And we checked the hidden field of n against g in every operation. That's why we're passing both n and g in the take and give operation. >>: Yes, so I was wondering if you considered doing this management instead of at runtime, instead of using a runtime mechanism doing it at compile time by attaching it to the type of g? So when you say give n to g, it actually attaches your permissions on n to the permissions of g for the rest of the program. >> Jonathon Protzenko: That is not something that we have considered. >>: I guess it's your escape hatch so you would probably not want to do that. >>: I mean this is essentially a tradeoff, right? I mean this is where you go from the static flow to the dynamic flow... >> Jonathon Protzenko: Absolutely. That's our layer. >>: ...to simplify the type check. >> Jonathon Protzenko: Yes. >>: Because, otherwise, you would have to quantify all the nodes and... >> Jonathon Protzenko: Yes, precisely. >>: ...implement sums over [inaudible]. >> Jonathon Protzenko: That's our design choice. We wanted to say we keep the base layer of the type system simple. We don't want to have complex predicates like there exists at g which conceptually owns the memory region in which the nodes live and then the nodes are pointing to other nodes from the same memory region. We don't want to express that. So our story for the user is clear: whenever you have complex aliasing patterns, just use that runtime test mechanism and the whole system remains manageable. I'm just going to skip the type-checking because it's slow. I'm just going to interpret the program. So I fail – Okay, I'm doing the dfs by passing it to the print function and it's printing 10 because that's the value that's stored in my node. So let's see what happens if I do a violation, like let's say classic rookie mistake: I’m visiting other nodes before relinquishing my ownership of n. So I take ownership of n. 
I'm visiting its neighbor which is itself, so I'm doing a double-take. There is going to be an error message at runtime that says that the take instruction failed. The error message could be better but I'm just going to skip on that briefly. That's about all for that extra layer. It is very important; otherwise, there would be a whole class of programs that we cannot write, like any realistic program. So this is about delegating the ownership of the individual nodes to the graph. The whole point is that dynamic is a duplicable type that contains no ownership information. There is a runtime test that guarantees that the whole thing works. It's already about an hour so I may go a little bit faster. I'm going to talk about the implementation of Mezzo. So I designed the language with my advisor, and I was mostly responsible for the implementation. So what do we have now? We have a working type-checker, a compiler that emits OCaml code with [inaudible]. There was a big implementation effort for the type-checker. We have a cheap module system – Well, not cheap. We have a module system as you saw which has several modules from several libraries. We have the error messages. We have all the visualizations that you have seen with the graphs that appear, the web interface, one other that I'm going to show very soon, and we have several thousand lines of libraries for about everything: mutable data structures, mutable trees, iterators. We have [inaudible] the protocol of iterators, laziness, etcetera. Why is type-checking made so difficult? There are some very complex algorithms in the type-checker. Type-checking is similar to proof search in separation logic. You have a function call. You basically have to solve the frame inference problem: extract from the permissions that you own just the minimal set so that you can run the function and leave the rest attached. 
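The give and take mechanism described earlier, with its hidden owner pointer and its runtime test, can be sketched in Python as follows. This is only a model of the dynamic side: the static permissions that Mezzo tracks (x at node, x at dynamic) have no counterpart here, and the names `Node`, `Graph`, `give`, `take`, `dfs`, `OwnershipError` and the `give_back_first` flag are assumptions made for the illustration.

```python
class OwnershipError(Exception):
    """Raised when the runtime ownership test fails."""


class Node:
    def __init__(self, value):
        self.value = value
        self.neighbors = []   # plain pointers, carrying no ownership
        self.seen = False
        self._owner = None    # hidden field: the adopter, or None


class Graph:
    def __init__(self):
        self.roots = []


def give(node, graph):
    """Hand ownership of node over to graph."""
    if node._owner is not None:
        raise OwnershipError("give failed: node already has an owner")
    node._owner = graph


def take(node, graph):
    """Regain ownership of node from graph; checked at runtime."""
    if node._owner is not graph:
        raise OwnershipError("take failed: node is not owned by this graph")
    node._owner = None


def dfs(graph, f, give_back_first=True):
    """Depth-first traversal calling f on each node value.

    With give_back_first=False the node is handed back only after its
    neighbors are visited, which reproduces the mistake from the talk.
    """
    def visit(n):
        take(n, graph)
        if n.seen:
            give(n, graph)
            return
        f(n.value)
        n.seen = True
        todo = list(n.neighbors)
        if give_back_first:
            give(n, graph)
        for m in todo:
            visit(m)
        if not give_back_first:
            give(n, graph)

    for root in graph.roots:
        visit(root)


def self_loop_graph(value):
    """Build the graph from the talk: one node that points to itself."""
    g = Graph()
    x = Node(value)
    x.neighbors = [x]
    g.roots = [x]
    give(x, g)
    return g


if __name__ == "__main__":
    dfs(self_loop_graph(10), print)  # prints 10
```

Calling `dfs(self_loop_graph(10), print, give_back_first=False)` prints 10 and then raises `OwnershipError`, because the node is taken a second time while it is still held by the first visit.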
As you saw, permissions are represented internally using graphs and we have another algorithm that is similar to what is called in the field of abstract interpretation a join on a shape domain. So I'm going to run another command. Okay, so this program – I think I may be able to zoom it. This program does something very silly. In the then branch it allocates a reference and returns a tuple whose two components are pointing to the same reference. And in the else branch it's returning a tuple whose fields are two physically distinct references. So the type-checker knows that on the one hand we have this aliasing pattern where we have a tuple which is defined here. So let me just zoom out a little bit. A tuple which is defined here. A reference which is defined here. It highlights in your source code the locations, and an integer that comes from here. And here we have the other tuple with these other references. So what does the type-checker do? Well, it's complex. You cannot say that we have a tuple whose two fields point to the same reference because that would be not true here. And conversely you cannot say that you have two physically distinct references because if the code takes the then branch, you're going to have something that's not true either. So what the type-checker does is pick an arbitrary side and decide that, well, the right component of the tuple is the reference and the left component of the tuple is nothing. It could've picked the other side. So that's an algorithm that we have. This problem never happens in practice. That's completely wacky. Programmers don't do that. However, I wrote that visualization to somehow try to explain to the user what is happening in terms of [inaudible] the decision that was made. Back to the presentation. The type-checker also performs type inference. When we have a polymorphic function we don't specify the type application; it is inferred. 
So it's kind of hard to infer because there is a lot of backtracking involved. You can have multiple permissions for the same variable. You always have x at equals x. You also have x at dynamic. You also have x at ref something. So there are a lot of different solutions to explore. >>: So you're sacrificing the notion of a best type then because you make this arbitrary choice. >> Jonathon Protzenko: Yes. It is very difficult to have a best type. We have CAD, like we know how to write the best type for a function. We can show the equivalence of several types together but, yes, the type-checker makes arbitrary decisions. There are a lot of heuristics, like there has been a lot of fine tuning. I would say in 98 percent of the cases, the type-checker goes to the right solution immediately. In some very complex examples like iterators or something, there is a lot of backtracking involved. But the type-checker is still able to do it. >>: Do you have a characterization of the hardness of the type-checking problem? >> Jonathon Protzenko: Yes, it's very hard. I mean... >>: Is there a [inaudible]? >> Jonathon Protzenko: I would say it's NP-complete just because we have quantification over permissions, so we can say for any permission p. And then if you want to infer that permission p you have to try every possible subset of the universe to figure out what could be that permission p that you have to pass to the function. So in those cases we have heuristics, like we may quantify over a certain p for map. The map function says that the small function that's being called repeatedly may have some internal state p and then, because our function types are annotated, the type-checker is smart and knows that the function that is passed to map tells you which p you should seek to obtain. But in the general case you have to annotate. Yeah. The type-checker also produces a typing derivation. If someone wants to check it later, we don't do it. But we may want to do it later. 
And there are also other extra modules that perform fixed-point computations. We have that thing called facts. The type-checker is aware that list of a is duplicable as long as a itself is duplicable, so that's something that is computed using a fixpoint [inaudible] recursive data types. Quick wrap up on Mezzo: a language that blends many ideas. We are keeping everything presented as a type system. It gives you strong guarantees and there is that escape mechanism when the base layer is not enough. What do we want to do in the future? We want to extend the language. We may want to have some patterns where you own a piece of mutable data but you may want to share it with someone just as read-only. There's another big question that has been somehow keeping us wondering: if you are going to great lengths to type-check a program, how much does that help you prove it? You already have a very clear ownership structure. Is that something that helps you prove functional correctness? Can you leverage what the type-checking of the Mezzo program gives you in terms of guarantees to help you prove the functional correctness of your programs? We may want to express type-checking operations as proof obligations for an SMT solver, see if that works better or worse than our custom algorithms. And we also want to explore visualizations. [Inaudible] visualizations that were mostly in the web browser. The reason is I've been contributing to Mozilla. I wrote some major add-ons for Thunderbird and I also wrote a book on developing add-ons for Firefox. And my big add-on is called Thunderbird Conversation; it gives you a conversation view in Thunderbird just like Gmail. So it's been interesting in the sense that add-ons are not written in standard JavaScript lingo because they're written using the internal version of JavaScript that lives in the Mozilla products. I get access to some nice features from the next versions of ECMAScript. That is destructuring assignment. 
That is a function expression. There is the let keyword. Iterators, for each, etcetera, etcetera. So that is not research. However, it's been nice for the error messages that I've shown you, the ones where you can see the graphs and click on the nodes to see where they're defined. And also for the – Sorry about that – step-by-step examples. Whoops. Yeah, I don't think I'm connected to the Internet so it's going to be kind of hard. >>: MSFT Open. There's... >> Jonathon Protzenko: Can I do that? [Audience comments] MSFT Open. [Audience comments] >>: People who are anxious to leave [inaudible].... >> Jonathon Protzenko: Okay. >>: We have the room until noon. >> Jonathon Protzenko: Okay, I'm going to do this real quick. Let's click on the tutorial. Connect. Let's click on the tutorial again. That's not loading. Okay, I'm going to get back to it later when it's loaded. The web interface that you saw, it all runs in the browser. I've compiled the Mezzo type-checker into JavaScript using a thing called js_of_ocaml and it only required minor modifications. Our dream is to have an interactive toplevel where you go back and forth and explore the permissions as they come and go in a visual manner. I think I'm reaching the end of my allotted time. So that thing does not seem to be intent on loading, so I'm just going... >>: We do have [inaudible]. Are you able to connect [inaudible]...? >> Jonathon Protzenko: Oh, yeah. It works. So that's the unfinished Mezzo tutorial. But what I wanted to show is that thing which kind of helps me in the tutorial explain what happens to people. So you basically have two boxes with the duplicable and the non-duplicable permissions. And you can step through the program and see the permissions as they go back and forth and explore things as they happen in real time. That is not auto-generated; that's handcrafted. I wish for it to be auto-generated; that's one of our goals because that's really a nice way to explain how programs are type-checked. 
And really we wish to have everything blended together: edit the program, step through it, see the permissions. That's more involved. So really quickly why am I here doing the presentation? I am interested research-wise in better verification tools. What I learned with Mezzo is that it really is hard to provide a better experience for users especially as things get more complex. There're a lot of things to explain. I've been spending an hour explaining Mezzo to very competent researchers. I don't know how we can explain that properly to a user. We need to have better tools. We need to have visualizations, interactive stuff that really makes it easy to understand what's happening with your program. And I believe that these are research problems. The algorithms, for instance, to select the relevant bits of information that you want to show to the user, that's completely non-trivial. Showing everything, drawing the graphs, is that a good way to explain stuff to the user? That's not clear. And the tooling: I've been implementing that thing in the web browser and it's great. I mean, people love it. They can go onto it. They don't have to install Mezzo. That is really something to explore but how? How do we do it? That is also another question. User studies, user interface, user interaction: that's not something to be taken lightly. There is a lot of real work to be done. And I really do believe in the web-based tools. You reach more users. You have more impact. Deployment is made much easier, and I really believe that there is something to be done here. There is something that's been explored at MSR too. Can we have an ultimate web-based IDE? I think that's a valuable research question. So really, what I would be eager to do is leverage the experience that I've acquired with Mezzo to contribute to projects that are being developed here on two axes. There's the types axis. There are a lot of projects that rely on types. 
There is Koka, a type system with control of effects and that's something that talks to me, right, speaks to me. F star: iterations on the design? Type system? There is also Lean; do we want to add separation logic theories into Lean? Can we somehow reduce the implementation techniques of Mezzo or the heuristics or the algorithms that pretty much do the same thing to teach that knowledge to the SMT solver? There are programs to be in written F star so really that's a lot of experience that would be eager to apply to existing projects. And also I think all the visualizations that have shown truly express the fact that I'm interested in what happens on the web, and I think that's something that is taking place here. Two things that I'm thinking of are Rise4Fun and TouchDevelop. And I believe that is something that I would be very eager to tackle and see what I can do about this. I've been trying to do it but I don't have – I'm just only one person so there is only so much I can do on these web examples. They need to be integrated, and I think that there's a lot of potential here. I think I've overlapped a little bit. But that was good, so thanks for the questions and for your attention. [Applause] >> Nikhil Swamy: We still have the room so if you have questions...? >>: I have one question. So there have been a series of attempts to create type systems that work for avoiding data races. Have you looked at any of the kind of programs that those [inaudible]? I mean especially parallel algorithms I guess would be a good test for a type system. >> Jonathon Protzenko: One program that I work with and feel is like that very textbook example of a concurrent queue where you have threads waiting on a condition variable and you have to make sure. 
That is something that is very neatly typed in Mezzo because we have a dependent type that says that the condition is that of the lock called L and it already rules out several errors where you broadcast a condition with the wrong lock or you mess up. So that would be pretty much the standard paradigm. One thing that we've wanted to explore is that here the graph, g, that I showed is uniquely owned. So you cannot have two concurrent threads trying to take from g at the same time. They need to have a lock on the graph g before they can take nodes out of g. This is due to the internal nature of the thing. And having g uniquely owned guarantees that our mechanism is sound in a concurrent setting. What we'd like to do is extend that give and take paradigm to objects that are immutable, so for that you need a compare and swap operation. But that is still feasible. That maybe would be a novelty, where you have a built-in paradigm for expressing people that try to take maybe jobs out of a queue concurrently and possibly fail. That would be a fallible operation. >>: How about something simple like parallel merge sort? >> Jonathon Protzenko: Parallel? >>: Parallel merge sort? Parallel sorting? >> Jonathon Protzenko: Oh, the one where you – Yeah, I guess that would – I didn't write it but I don't see any difficulty as long as you do it with lists. >>: The difficulty usually in that case is that you're talking about indexes into an array so you have to reason about [inaudible]... >> Jonathon Protzenko: That's why I said lists. With an array, it is very difficult. >>: Okay. >> Jonathon Protzenko: We've had an intern work on that, and if you use arrays, you have to start talking about arithmetic portions of the array, who owns the array from those indexes to those other [inaudible]. And that is something that is still an open question in Mezzo. [Inaudible] works well as long as you use lists. So you do the partition. You get two different lists. Spawn. Ownership and then merge them.
That works. With the array? I don't have an answer for the array. >>: I have another question. So if your type system is in place, what are the kinds of programming errors that [inaudible]? >> Jonathon Protzenko: So the first big guarantee is that programs written in Mezzo are data race free. So that rules out every access to mutable data that is not protected in some way or another. That really is our big motto. Another thing that we're trying to sell is that we have that mechanism for expressing intermediary states. So there are patterns where you progressively initialize objects. That's something nice that we can type-check as well. But really we did it – we built everything from the ground up with the end goal being to have an answer for concurrent programs. >>: There is a piece of work that Matt Perkins was doing with the [inaudible] team on the M sharp language. I think it was published in [inaudible] a few years ago. Many aspects of your system seem a bit like that. Are you familiar with that work? >> Jonathon Protzenko: It does not seem to ring a bell. >>: Okay. >>: So why did you choose specifically going with the functional languages [inaudible]? And how much of this could you transfer to other languages? People try to do [inaudible] in imperative languages, I mean OCaml of course is kind of [inaudible]. But still what was the main force you went there and how much of it is applicable in other languages? Like type qualifiers or something comes to mind like when people try to [inaudible] similar things. >> Jonathon Protzenko: There are two points that I can bring here. The first one is that it's mostly a cultural thing in the sense that in the team we're familiar with OCaml so we initially went for something that goes a little bit further than ML. Another thing that may be worthwhile is that, I don't know if you've heard of the Rust project at Mozilla? They're pretty much trying to do the same thing: a language with control of ownership.
They're not using the same mechanisms. They're using lifetimes instead of permissions, but the goals are the same. But they're in a much lower-level setting in the sense that they're talking about what is allocated on the stack, what is allocated on the heap. And that makes their lives much more difficult because they have to talk about the different varieties of objects depending on where they live. And that is something that we do not have. So I think we simplified our lives by going into the ML setting. Everything is heap allocated. Everything is passed by reference. There is a garbage collector. And that allowed us maybe to focus more on the design ideas rather than the interaction with the low-level world. So I guess the ideas are applicable to a wide variety of languages but they would probably run into some concrete issues if you tried to apply them, say, to a language that has stack versus heap allocation. Yes? >>: So I'm interested a little bit more about the object layout. Now I noticed when you were writing the types, like the intermediary types that were, I believe you said not derived from a nominal type? >> Jonathon Protzenko: Yes. >>: So you had them written out as you would have the tag and then you have an open curly and then you have names, position, labels... >> Jonathon Protzenko: Yes. >>: ...set to be types. >> Jonathon Protzenko: Yes. >>: So you can only change the tag to other nominal types that have the exact same position labels and the exact number of fields? >> Jonathon Protzenko: No, the names of the fields do not matter; only the size of the block matters. If you change from tag a to tag b, your thing gets renamed with the fields that belong to tag b. So we're structural for the field names in the sense that they're updated as you mutate the tags.
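[Editor's sketch, not part of the talk: the answer above says that a tag can be changed as long as the block size stays the same, and that the fields are then read under the names of the new tag. The small Rust model below illustrates only that idea. The names `Block`, `Ctor`, `retag`, and `get` are hypothetical, the field values are plain integers for simplicity, and the size check is performed at run time here purely for illustration; in Mezzo this would be the type checker's job, and nothing below is Mezzo's actual implementation.]

```rust
/// Hypothetical description of a constructor: a tag number and field names.
struct Ctor {
    tag: u8,
    field_names: &'static [&'static str],
}

/// Hypothetical model of a heap block: a tag plus a fixed number of fields.
#[derive(Debug)]
struct Block {
    tag: u8,
    fields: Vec<i64>,
}

impl Block {
    /// Allocate a block for `ctor`; the field count must match its arity.
    fn new(ctor: &Ctor, fields: Vec<i64>) -> Block {
        assert_eq!(fields.len(), ctor.field_names.len());
        Block { tag: ctor.tag, fields }
    }

    /// Change the tag in place. Only the block size has to agree;
    /// the field names of the old and new constructors are irrelevant.
    fn retag(&mut self, to: &Ctor) -> Result<(), String> {
        if to.field_names.len() != self.fields.len() {
            return Err(format!(
                "cannot retag: block has {} fields, constructor has {}",
                self.fields.len(),
                to.field_names.len()
            ));
        }
        self.tag = to.tag;
        Ok(())
    }

    /// Read a field by name, using the names of the block's current tag.
    fn get(&self, ctor: &Ctor, name: &str) -> Option<i64> {
        if ctor.tag != self.tag {
            return None; // the block is not currently tagged with `ctor`
        }
        let index = ctor.field_names.iter().position(|n| *n == name)?;
        self.fields.get(index).copied()
    }
}
```

[In this model, a two-field block created under one constructor can be retagged to any other two-field constructor, after which the same slots are read under the new field names, while retagging to a one-field constructor is rejected.]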
>>: And then, the second thing is so because you can exchange the types – or, sorry, you can exchange the tags between pretty much anything, I assume that means that all of your tags then have to be assigned a unique [inaudible] across the entire program space then? >> Jonathon Protzenko: I don't think so. >>: How do you distinguish between the tags [inaudible]? >> Jonathon Protzenko: Type-checking gives you runtime safety. >>: Oh, okay. >> Jonathon Protzenko: We know that the compilation is going to be fine. >>: So you don't actually keep – so those are not actually tracked at runtime then? >> Jonathon Protzenko: Well the tags exist at runtime. The compilation model is that when you have a data type with several tags, they're allocated in sequence. Right? 0, 1, 2, 3 and then that's used for compiling pattern matching to discriminate on the tag. That's all we need. So we just need tag numbers to be distinct across branches of the same type. You just need that to compile your pattern-matching properly. >>: Right. And you can't pattern-match between different nominal types? >> Jonathon Protzenko: No, you can only pattern-match on the things from the same nominal type. >>: So when you passed [inaudible] that also gave the type at the end, right? I mean when you took ownership from the graph and put it back I guess [inaudible] or whatever, but also gave type to the variable because before it was [inaudible] and then after that... >> Jonathon Protzenko: Yeah, so we looked up the definition of g. We see that g – that's not a. I don't know if you remember... >>: So it can then only adopt one type. >> Jonathon Protzenko: Yes. It would be unsound otherwise. And that is something that we have explored as well and you can only have one object adopt one type. Or you just do a variant type if you want to have an object adopt multiple things. That could define a new type: thing a of thing a else thing 2 of thing 2. >> Nikhil Swamy: All right. Let's thank Jonathon.
>> Jonathon Protzenko: Thanks. [Applause]
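
[Editor's sketch, not part of the talk: in the Q&A, the parallel merge sort answer is "partition, get two different lists, spawn, then merge", with each half owned separately. The Rust code below is one way to picture that shape; it is not Mezzo code and was not shown in the talk. It assumes owned `Vec<i32>` values stand in for the lists, and the function names and the `depth` cutoff are hypothetical. The point it tries to make is that each half is a separate owned value handed to exactly one thread, so no reasoning about index ranges of a shared array is needed, which is the case the speaker describes as still open.]

```rust
use std::thread;

/// Merge two sorted lists, consuming both of them.
fn merge(left: Vec<i32>, right: Vec<i32>) -> Vec<i32> {
    let mut out = Vec::with_capacity(left.len() + right.len());
    let mut l = left.into_iter().peekable();
    let mut r = right.into_iter().peekable();
    // Copy the two heads out so no borrow is held while advancing.
    while let (Some(&a), Some(&b)) = (l.peek(), r.peek()) {
        if a <= b {
            out.push(a);
            l.next();
        } else {
            out.push(b);
            r.next();
        }
    }
    out.extend(l);
    out.extend(r);
    out
}

/// Sort an owned list. While `depth` is positive, the right half is
/// moved into a freshly spawned thread; below that, sorting is sequential.
fn parallel_merge_sort(mut items: Vec<i32>, depth: u32) -> Vec<i32> {
    if items.len() <= 1 {
        return items;
    }
    // Partition: two separately owned lists, no shared storage.
    let right = items.split_off(items.len() / 2);
    let left = items;
    if depth == 0 {
        return merge(parallel_merge_sort(left, 0), parallel_merge_sort(right, 0));
    }
    // Spawn: ownership of `right` moves into the new thread.
    let handle = thread::spawn(move || parallel_merge_sort(right, depth - 1));
    let left_sorted = parallel_merge_sort(left, depth - 1);
    // Joining hands ownership of the sorted right half back.
    let right_sorted = handle.join().expect("sorting thread panicked");
    merge(left_sorted, right_sorted)
}
```

[A call such as `parallel_merge_sort(data, 2)` would spawn threads for the first two levels of the recursion only; the cutoff is there so that the number of threads stays small.]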
