>> Rustan Leino: Good afternoon everyone. I'm Rustan Leino. It's my pleasure to introduce Tucker Taft, who will talk to us about his new language design. Tucker has been involved in languages and different program analysis techniques for decades, starting at Intermetrics in Boston in the '80s and '90s, then doing his own company, SofCheck, in 2002, which more recently was acquired by AdaCore, which I guess seems to have collected a number of companies. So today he'll tell us not about Ada, even though he's been quite involved in Ada 95 and Ada 2012, but about ParaSail. So, welcome. >> Tucker Taft: Thank you. So I thought I might also say, you know, why am I here? Well, I met Rustan at a very nice little conference in Maine of all places, and I brought along my Maine hat just in case there was any doubt that I'd been to Maine. Anyway, I think we sat in on each other's presentations and there was a tremendous amount of overlap in what we were having to say about what we believed in and what we were trying to accomplish and so on. And so Dafny and ParaSail I think are trying to address some of the same issues. And this morning I was walking over here from my nice little B and B and I was thinking, well, let's see, ParaSail plus Dafny -- ParaDafny? ParaDucks. So that was my official name for the attempt to combine these two. Anyway, this is about ParaSail, not about ParaDucks, but I think you will see some similarities and overlap, at least if we get into the stuff about assertions. I'm going to start by talking more about the parallelization attempt. So I set out a while ago to start designing a new language. I've been involved with Ada since 1979, roughly, which is before Ada was even a standard, and got very involved pretty soon thereafter. In 1990 we won the contract to do the first revision of Ada, Ada 95 as it's now called. At the time it was Ada 9X because we didn't know what it would be. And I was the lead technical person on that. But well before that, in the '70s, I started getting interested in programming language design. I don't know why, but it was sort of one of those things, you know, that hits you, and I've always been interested in it, and, as I said, about three or four years ago I got interested in starting over. Let's just take a blank sheet of paper: what could we do if we really wanted to start from scratch? And one of the reasons to do that was this whole multicore thing. That is, we're now dealing with a world where the number of processors is going to be growing, or doubling, perhaps every two years. We used to get twice-as-fast processors, but now it's going to be twice as many, and that's a fundamentally different problem to solve. In the old days we could sit on our duffs and our software would get twice as fast; now we actually have to get a little more clever to take advantage of all those extra processors. So anyway, this language is called ParaSail, for Parallel Specification and Implementation Language. It's intended to be a very formally-oriented language. It has built-in assertion syntax and so on, and it's pervasively parallel. That is, it's easier to do things in parallel than sequentially, in some ways. Starting from scratch doesn't really happen. So I have to admit that I've studied a few languages in my time, and you will probably see plenty of them coming through in some of this, but there are a few things that I think are unusual. And it may be the combination which is perhaps unique.
Anyway, this is a little bit of an insult to C and C++, and I'm probably insulting at least some of you here. But my belief is that C and C++ are two of the least safe languages designed in the last forty years, and they are winning in the safety-critical world. I'm very involved with safety-critical systems, and it used to be almost 80-20: 80 percent were written in Ada and 20 percent were written in C. But as the number of safety-critical systems has grown and they've branched out from the military-industrial complex into the commercial world, Ada has not followed that path. And so now instead of 80-20 Ada to C, it's more like 80-20 C to Ada. And C++ is used not at the highest levels of criticality, but it's certainly growing in use in some of the safety-critical systems. So I understand why Ada did not make the jump. Is that me? I don't know who that is. It's not me. Anyway, I don't think that'll bother us too much. What's the beeping? >>: It's the battery backup actually. >> Tucker Taft: Battery backup? Is that for my pacemaker? You heard about the guy who hacked his own pacemaker? That happened recently. That's kind of cool. That's just to prove that you can hack into anything, I guess, especially if you're a little crazy and you have a pacemaker. Anyway, one of the things that I think most of us in this room recognize, but not a lot of people elsewhere recognize, is that computers actually stopped getting faster in about 2005, which is longer ago than I would have guessed. My laptop is looking a little old here, but it's still three gigahertz, and if you go out and buy a laptop today it's kind of still three gigahertz, you know? And if things had kept doubling, which is what they were doing for 25 years or so, this would be in the 50-gigahertz zone, if I got a new one that is. Anyway, chips are going to start having more and more cores. The other thing that I think has happened is that static analysis, and that's represented well within this group, has actually come of age. And it's time to get it into the language, you know, stop beating around the bush where you have a separate tool or whatever. Let's get it into the language. This is just a slightly enlarged version of that IEEE slide. And it's pretty dramatic what happened in 2005, where they just started cooking their chips and decided that was not a good idea, and maybe we should try to figure out how to make them higher performance in other ways, and that's where all this multicore stuff started. It's interesting to go back and look at articles written in that zone, 2004, 2005, 2006, where they realized that things really were going to change. Intel calls this their right turn, because this is a huge deal for them. They were having no trouble selling a new chip every couple of years to people with laptops when it was twice as fast, you know? You got a 1-gigahertz? My goodness, come on! 3-gigahertz is what it's all about! But now it's kind of hard to convince people: oh, you've got to have a new laptop because it's sort of prettier, or, you know, thinner or something. Anyway, I mentioned I drew from various languages. ML was certainly a source of much inspiration. Ada certainly. Cyclone -- I don't know if you know Greg Morrisett, but this is something he worked on, with region-based storage management. CUDA, Cilk -- a lot from Cilk. Anyway, a lot of these influences will probably show up.
So I had this idea, a new programming language, and I knew I wanted to have lots of good parallelism and lots of good formalism, and I knew that since I was going to start adding things it had to start from a pretty simple core. So I started ripping things out of languages that I loved, to try to get down to the absolute bare minimum. And I think if you can do that, that's generally a good thing. If you're trying to do parallelism and formalism, well, one reason was to make conceptual room in people's minds. You know, if the language is already complex and then you start adding more into it, it's going to be really hard to deal with. But the other reason was I wanted it to be pervasively parallel and safe and easy to verify. Maybe easy to verify is an overstatement. Straightforward, that's my usual word for when I can't say easy, but easier to verify. And getting rid of things is definitely a good place to start when you're trying to do that. Parallelization: parallel by default, every expression is parallelizable. So if you write F(X) + G(Y), F(X) and G(Y) can be evaluated in parallel and the programmer doesn't have to worry about it. It happens automatically. If you look at languages -- if you're familiar with Cilk, it's a language that came out of MIT, Charles Leiserson I believe, and then was bought by Intel and they've put more energy into it. It has very easy parallelization, but you've got to say spawn. If you want an expression evaluated in parallel you say spawn this or spawn that, and it's caveat [inaudible]. If you say that, they're assuming, well, then it must be safe to do that. There's no real guarantee that it's a safe thing to do. I wanted a language where there wasn't that issue -- the parallelization was always safe, and if it weren't, the compiler would catch it. And make it so easy and sort of baked in that it's easier to write parallel than sequential. Because I think if we want to get our programmers to all write it this way, we have to make it really easy. And then from the formalization point of view, there's been a growing acceptance of pre- and postconditions, type invariants, that sort of thing, getting them right into the program, and in my view the compiler ought to be the one that's enforcing these. Now, it may be a smarter compiler than your average bear; maybe it has Boogie and Z3 in its back pocket. But I think from the programmer's point of view it ought to be part of the same tool. Every time you compile, that gets checked. And if you can do all that at compile time, then do we really need runtime exceptions, the overhead of runtime exception handling and the conceptual overhead of dealing with exceptions, which make thoroughly testing a system much harder? No. So here's my little list of simplifications. And actually there are more than just these, but this is already enough to have something to chew on. So: no pointers. That actually kind of came last. It was not where I started at all. I knew global variables were going to be trouble. If you're trying to parallelize F(X) + G(Y) and you're not allowed to look inside them, you're just looking at their spec, well, either you've got to be very explicit about what modifies what, which is a pain in the neck for programmers to maintain, or you've got to just say, well, no global variables. Why do we need them? Well, you know, we're used to them, or it's painful to pass around parameters and so on. But I was willing to give it a shot anyway. No hidden aliasing.
That is, when you get into a procedure, if you've got two parameters, X and Y, and you can do an update on one or the other of them, you can rely on the fact that they're not the same parameter. So if there's any aliasing it's things like A[I] versus A[J] -- those might be the same, and that's determined by whether I equals J. But X versus Y? Well, there's no way those could be aliased, unless you passed the same thing for both parameters, for example, or one was a global and one wasn't. So we're getting rid of that sort of aliasing, and then various other simplifications. This one I think is relevant to people thinking about parallelization. A lot of languages have explicit threads or are adding them, as C++ just did recently, and I think that's not the way to achieve massive parallelization or pervasive parallelization. It's like asking programmers -- in the old days in C you had to say I want this variable in a register. So you could pick, like, three of your local variables and say register, and then the compiler would, you know, say yeah, okay, that'll be register three, that'll be four, and that'll be five. And at some point we said, you know, that's not really the programmer's concern. There are too many registers. We've got 16, now we've got 32 registers. I can't have them trying to figure out which of these local variables should be in registers. And it's probably some of the temporaries that should be in registers. So as the number of cores begins to grow, expecting the programmer to say, well, I think this should be a thread and that should be a thread and, you know, one core for all this code and one core for all that code -- well, there are too many. Maybe there are a hundred threads -- or a hundred cores. It just gets out of hand. Same thing with locking and unlocking, you know? If you have to kind of worry about, oh jeez, I guess I'd better lock because there might be two threads, that's just not going to work. And again the idea of waiting and signaling, doing it explicitly, where okay, I'm waiting on this condition variable and hoping somebody is going to signal me at some point, and I hope they do it, and for the right reason, and so on. Trying to get rid of some of those things. No race conditions. That is, the compiler ensures that there are no undesirable simultaneous accesses to shared data. If that is possible it would either complain or, in some cases, perhaps insert additional synchronization, but the general rule is that the language is designed so that the compiler can complain about any such possibilities. So why those simplifications, and in particular, why pointer-free? I mentioned F(X) + G(Y), and a lot of it comes down to thinking about that problem. We're just trying to safely evaluate F(X) and G(Y) in parallel, without looking inside of F and G. So what does that mean? Well, let's assume X and Y were parameters coming in to us, and now we're calling off to F(X) + G(Y). Well, we probably ought to be able to say that those aren't aliased. No global variables is clearly helpful: if F and G both manipulate the same global variable, they might be stepping on the same object. No parameter aliasing is important, which I already said several times. And what do we do if X and Y are pointers? Well, I'm not even sure what that means any more. If I'm saying no global variables, what can they do with a pointer? Well, is dereferencing a pointer referencing a global variable? It depends on your model of how global variables work, but let's assume it's not.
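[To make the F(X) + G(Y) discussion concrete, here is a rough sketch in ParaSail-ish syntax; Double, Demo, and Bump are made-up names, and the placement of the brace assertions follows the "preconditions before the arrow, postconditions after" convention described later in the talk, so treat the details as approximate:]

    func Double(X : Univ_Integer) {X >= 0} -> Univ_Integer
      {Double == 2 * X}     // postcondition: the function name denotes the result
    is
        return 2 * X;
    end func Double;

    func Demo(A, B : Univ_Integer) {A >= 0 and then B >= 0} -> Univ_Integer is
        // Double(A) and Double(B) may be evaluated in parallel; no "spawn" needed,
        // and the compiler rejects code where parallel evaluation could be unsafe.
        return Double(A) + Double(B);
    end func Demo;

    // The no-hidden-aliasing rule in action, for an operation that updates
    // its first parameter:
    func Bump(var X : Univ_Integer; Y : Univ_Integer) is
        X := X + Y;
    end func Bump;
    //    Bump(A[1], A[2])   -- fine: clearly distinct elements
    //    Bump(A[I], A[J])   -- allowed only if the compiler can show I and J differ
    //    Bump(Z, Z)         -- rejected: both parameters name the same object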
We have this fundamental problem. If I've got X and Y, is there a danger that F or G could follow the pointers I give it and ultimately arrive at some common object Z? And the answer is, without a lot of analysis, that's probably quite possible. You don't really know where pointers ultimately lead. You can do various kinds of static analysis to prove that X and Y are separate, you know, separation logic and so on, but we're not expecting the programmers to do a lot of annotations that are somehow going to allow you to prove that, and we're not allowed to look inside F or G. Pointers make life a lot harder. Furthermore, if we have parameter modes like in-out, or in, or var versus non-var, how does that relate to the things that they point at? Take a language like Ada, which has parameter modes. You can declare pointers that only give you read-only access to the object they designate, but then it only ever gives you read-only access. If you pass a pointer as an in parameter and it's an access-to-variable, then whoever you pass it to can update that variable. So parameter modes don't really help very much here. You've got to be able to know, well, what are they going to be able to do when they reach through levels of indirection? So after almost a year or two of doing language design I finally arrived at the answer: well, let's just get rid of pointers completely. And I started trying to write code that didn't use pointers. Pretty early on I realized that I needed to represent things like trees without pointers, and the way to do that is essentially to add to every type an additional value. This is like Maybe types in many functional languages, and pointers, for that matter, have null values. But don't think of null as a pointer, just think of it as a very small value. Or it might be equivalent to sort of uninitialized, but it's got a name and you can check for it, so it's not quite as bad as just a random bit pattern. Anyway, for every type, be it a record type, or an array type, or an integer type, or a Boolean type, we're going to add an extra value to that type. So it has all its usual values -- true, false -- and null. And you can use null when you declare an object or a component or a parameter to be optional. And that basically means, okay, this might be a null value. If you don't say optional it can't be; it's got to be one of the normal values of the type. But if you say optional, then that allows you to put null into that object. And my conclusion was that a lot of uses of pointers are for that purpose. That is, the purpose of starting off with nothing and then putting something there. And a pointer is sort of the only way to do that in many existing languages. But if you just sort of say, well, let's add another value called null, then I can start off with nothing and then I can put a tree there. Or I can start off with nothing and put an array there. And that's one of the big uses for pointers. Now, if you're not going to have pointers, then assignment had better not be pointer assignment, because then suddenly you've got pointers again. So assignment is going to be by copy. But that's kind of painful. Do we always want to do it that way? Suppose I've got a big tree here and a big object I'm trying to put into that tree, or I'm trying to put an object into a set or into some other kind of container. I don't want to make a big copy of it and then delete this one, so why can't I just actually move it in?
So if you're not going to have pointers, then you may need additional operators that are similar to assignment but are for the purpose of fundamentally moving a piece of data into a container. Or swap, it's the same thing. If assignment does copying and you're trying to, you know, switch two halves of a tree to do balancing or something: if I have to do it by assigning to a temporary, that makes a copy, then I assign this over to here and that makes a copy, then I assign over here and that makes another copy -- that's a lot of copying. If I can just swap the left and right halves of a tree, then that's much nicer. So that gets you a big portion of the uses of pointers, and then let's generalize the notion of arrays. Fortran had arrays; it didn't have pointers. It didn't have records, for that matter, when it started off. And people managed to survive, but it was painful, and pretty quickly we got away from that if we could. But suppose you have containers that allow the user to define what indexing means. It might be a hash table or it might be a set or it might be an array, it might be some kind of expandable vector, but it's got the notion of an index, so you can say I want to get the nth element, or I want to change the nth element, or maybe it's the element whose key is ABC. There's a general notion of an index or a key, and there's a general notion of a container. And if we allow the programmer to implement these sorts of containers and then use indexing syntax to refer to them, perhaps we don't really need pointers even for cyclic data structures. We know for tree-like structures we can do without them, but for cyclic data structures it's pretty hard to do without them unless you have some alternative. And so let's just make indexing very general. The individual elements in your container can grow and shrink, because they can be null and they can become non-null values. So we can write things like "for each N in Directed_Graph[I].Successors loop", and the successors are not pointers, they are a set of node IDs, and we are now iterating through all of the successors of some node of the graph. We can add edges to the graph, we can add nodes to the graph and so on, but we don't really need pointers for that purpose. In fact, in my experience, when I started looking at code I'd written in the past, using a language that had pointers, to do directed graphs, I frequently didn't use the pointers to do the directed graphs. I frequently did it this way anyway, for other reasons, like the nodes often have existence independent of who's pointing at them. You create a graph with a bunch of nodes and then maybe you add edges and remove edges and so on, but you don't want the nodes to just disappear because no one's pointing at them, because they have some existence independent of the edges connecting them. So it's not obvious that pointers were the right solution for a directed graph anyway. So my hope was, well, given this I can do anything that I could have done with pointers. And with this additional stuff I minimize the actual need for them. The other thing that came along with this was that objects are now sort of self-contained. That is, they grow and shrink, but they don't dangle, if you will. You can't have a dangling reference to an object because there are no references at all. Objects live conceptually in the stack frame where they are declared, and they can grow and shrink, but no one's going to have a pointer to them that might outlive their lifetime. So that allows you to change your whole storage management approach.
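[The assignment-like operations look roughly like this -- a sketch with made-up names; ":=" is copy assignment, and "<==" and "<=>" are the move and swap operations just described:]

    func Demo_Moves(var S : Vector<Univ_String>) is
        var Tmp : optional Vector<Univ_String> := ["a", "b", "c"];
        S := Tmp;        // ":="  copies; Tmp keeps its value
        S <== Tmp;       // "<==" moves; Tmp is left null afterwards
        S[1] <=> S[2];   // "<=>" swaps two components in place, with no copying
    end func Demo_Moves;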
So, no global heap, because you don't need to create objects in places where they can live indefinitely. They can only live as long as the scope in which they exist. So we now move all of our storage management to local, region-based storage, where the objects are local and there's essentially a local heap, which is essentially what a region is. And their growth and their shrinkage are all within that local heap. >>: So [inaudible]. So you're not going to really have the heap in that sense -- well, you're not going to have cycles, really explicit cycles, because you don't have pointers? But you're going to simulate that because you're going to say, well, you know, the index of this thing is over there in that collection at index J. You're going to -- >> Tucker Taft: Well, an index is just a value. It's just a key of some type. You can use it to some degree as you see fit. It's a scalar value, it's a string, you know, and you can store it in other -- I mean, here I'm imagining I've got a set of those [inaudible] stored in the successors set, but I could have come up with it -- I mean, where did it come from? Maybe it was passed in as a parameter, maybe it was the result of -- >>: [inaudible] out of bounds? You said you had no runtime exceptions. So you still have the notion of a data structure with some bound, to some extent. >> Tucker Taft: Right. So there are preconditions on using [inaudible] under certain circumstances. And, for example, it might be that it's got to be in the range 1 to 100 if it's a hundred-element array. And so those preconditions are things that have to be actually checked at compile time, and you have to prove that you're within the range. Or the alternative is that you might have a data structure like a map, where it returns a null if there's no value at that index. So those are the two ways you would deal with that. Does that answer your question? Okay. Other questions? And one thing that you do bump into is, you've now got a tree, let's say, and it's maybe fairly elaborate, and you've now asked to have this sub-sub-subcomponent replaced. It's currently null, so it's sort of like a little stub on the end of a tree somewhere. I'm now going to put a branch there. And let's suppose I do this by passing it as a var parameter -- that particular stub, I want you to stuff something into it. Well, currently it's null, and I want to be sure that you put that new piece of the tree in the right region. So either I have to pass in to every procedure an additional indicator of what region I want you to put things in, if you allocate them and add them to my tree, or the tree itself has to identify where that addition should go. Does that make sense? You're trying to make certain that all of the pieces of this tree live in the same region. And I've now asked you to plug something into this null part of my tree, put a non-null piece into that. So what we do is we make each null identify what region it is for. So null is not just the value zero; it actually identifies -- it says, I'm null, but by the way, if you want to replace me with something non-null you've got to put that in region Q, for example. Yeah? >>: It seems a little odd though. If you have a tree with holes like that that are associated with particular storage regions, what happens when you use your copy operator to copy the whole tree into some other region and now these guys are still talking about the old one? You've got a whole bunch of fix-up to do. >> Tucker Taft: Correct.
When you create a copy, the nulls in the new value identify the new region in which it's living. >>: So although in the original tree they could be referring to regions other than the tree's -- >> Tucker Taft: No. Okay, they're all referring to the same region, that is, the tree's -- >>: So there's some notion of parentage, that a null carries the region of the container it's living in. >> Tucker Taft: Exactly. And it's the place where that object was declared that fully determines everything about all the storage within it. Now, it is possible to have what you would probably call a pointer: short-lived references to existing objects. For example, when you pass a parameter we don't make a copy; we just pass a reference to the object that you're referring to. But it is a hand-off if the object is being updated by the function you're calling. When you pass it you are essentially logically handing it off, and no one else is allowed to update it while it's being manipulated by that function. And you can also return references, but you can only return references to things that are passed to you. And that's how indexing is implemented. You have to be able to allow users to implement their own indexing. What can they do? You pass them a map; they can pass you back an index to an element of that map -- I mean a reference. So, trees. I've already talked about this, but this is a little more specific. How is a tree represented? So here is an example. In ParaSail, every module has an interface and a class that implements it. The interface gives you the external view and the class gives you the implementation. This is an external view of a non-opaque type. We're essentially exporting how this type is represented. But you could put this all in the hidden part of the class itself and then just provide operations for, you know, setting and getting these things. But at this point this is just indicating how you would represent a tree, or a node of a tree, in ParaSail. It has a payload, which is of the parameterized type here, and left and right, which are optional trees default-initialized to null. So that's what a node of a tree looks like. And then they grow by having trees plugged into the left or right component. Does that make sense? So here are a few little examples of creating a tree. We declare a variable Root of type Tree_Node over strings, and we initialize its payload with the value "top", and left and right default to null. Then we can assign into the left tree a new tree node with payload "L" and with its right tree having payload "LR". And conceptually these are values, as opposed to things that are pointed at. These are just values whose size is more dynamic than the average value, but conceptually it's value semantics. And then if I want to move some part of the tree from one place to another I can use this move operator, which has the side effect of leaving the right-hand side null. So essentially, at a semantic level, it's copying the right-hand side into the left-hand side and setting the right-hand side to null. If these two happen to be from the same region, you can do that without physically copying. You can just essentially move the underlying pointers. But from a semantic point of view, this is assignment followed by setting the right-hand side to null. Does that make sense? Okay. Anyway, a little bit about region-based storage management. Cyclone, this is Greg Morrisett and a few others -- Yeah? >>: Just on the previous assignment.
[inaudible] that you did it the other way around? Root dot left dot right gets root dot right. If you then first do the move and then put null in root dot right, you'll cut off the thing that you just moved. So does one have to be contained within [inaudible]? Do you have to make sure that one is not contained in the other when you do the move? >> Tucker Taft: No, it is smart enough to not lose the data. So you can, for example, remove an element from a linked list by replacing the linked list with its second element, essentially, and it will then remove the intermediate element. Does that make sense? Okay. So, region-based storage management. You may be familiar with it. This is a little bit of a twist on it, in that in Cyclone there are pointers, but the pointers have an implicit region associated with them. And so you can think of them as being [inaudible] into that region. They're not general-purpose pointers; they can't point arbitrarily at arbitrary places; essentially they can give you a reference into that region. But the other thing about it is that regions live in a certain scope and they're local to those scopes. And when you leave the scope you reclaim the whole thing, which simplifies at least storage deallocation. So we're getting the same advantage here, but it's simplified even further because there's only one pointer to every piece of data. And so when that pointer is set to null you can immediately reclaim that storage. So you have these local regions, which are essentially local heaps, and the space for an object declared in that scope is allocated from that region. And when the object shrinks, because you have a non-null part of it that you set to null, that space is immediately returned to the region. So the region has its own little storage manager, if you will, and it's very simple initially if there haven't been any things returned. But if you start returning stuff then it'll keep the usual list of reusable pieces or something. But then when you leave the scope you just flush the whole thing, because you know all the objects that are living in that region were declared in the same scope. They can't be from an outer scope. And as I mentioned, move and swap are very cheap when you're staying within the region. It's analogous, if you think about a file system, like on Unix maybe, or these days a USB card: I can move a file within the USB card and it's very fast. Oh, that's no problem, it's just sort of changing some pointers and so on. If I move it onto my laptop it's actually got to copy it. Same idea here. Each region is like a separate disk, if you want to think about it that way -- in the Unix world, a separate drive. And if you move or swap within the same region it's very cheap; if you move across, then there is some actual copying. To help in this process, if you know you're creating an object that is going to be moved into a particular container -- so I'm creating this object and I'm going to move it into this tree, or I'm creating this object and I'm going to move it into this set -- then when you declare it you can say it's being created for the purpose of becoming part of that object. And so here it will declare this object and allocate its space out of that region, which you know will outlive this object. So essentially this is like a late declaration of an object that's going to be living in an outer region. And then at some point we can move X into the root object here, which is what I'm showing.
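[The slide here is roughly the following sketch; the "for Root" clause is my reading of the hint being described, so treat the exact syntax as approximate, and Root is the tree node from the earlier example, with Root.Left assumed already non-null:]

    var X for Root : Univ_String := "hello";
        // hint: X is destined to become part of Root, so allocate it
        // directly in Root's region rather than in the local region
    Root.Left.Payload <== X;
        // the move is now cheap: same region, no physical copying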
Root dot left dot payload -- we move X into that, and now we know that'll be cheap. This is just a hint. It can be completely ignored; it wouldn't change the semantics, but it affects the performance. Does that make sense? The other thing about using local heaps is that they're generally better behaved for parallelism, because now objects are generally not scattered all over the global heap according to the point at which they grew. If you're adding something to an object, and then quite a while later you add another thing to the object, and then another thing, then the object is spread all over your global heap. If, on the other hand, each object lives in its own region, then you know that the space you're allocating for it is going to be coming out of that same region. The other thing is that when you have multiple threads that are all allocating space in parallel, they're generally each working in a separate region when they're doing that, if you're using regions. If you're using a global heap they're probably allocating out of the same global heap, and you end up with unrelated threads that are allocating storage right next to each other. And if they're going to be updating that storage, that's really bad news, because you're going to get into cache false-sharing type things where they're essentially knocking each other's data out of the cache. So region-based storage management is essentially better behaved when you start getting into multiple cores. Yeah? >>: So what do you do about [inaudible] state? If I have a long-running loop that carries state from each iteration to the next, I have to declare a region that encloses the entire loop in order for the object to be live, right? In which case I would [inaudible]. >> Tucker Taft: Loops are implemented by having each iteration be its own thread, and when you start another iteration you can pass it some parameters, and those might be what it needs to do the next iteration, for example. >>: So each loop iteration is a separate thread. So how do you do loop-[inaudible] state? >> Tucker Taft: At the end of one iteration you can pass a parameter to the next. I don't know if that's answering your question. >>: In which case you wouldn't be able to -- I guess you could reuse a thread package. >> Tucker Taft: Yes. The thread is now reusable. I'd have to see an example to exactly answer your -- >>: So the thing in Cyclone where one thing was a problem was that the regions were not sufficient to be able to write [inaudible] programs. You ended up having to do things like having linear types and so on, so that you could escape the [inaudible] of a region-based storage management system, so that you could do things like [inaudible]-state. Because otherwise you would just run out of memory if you were just sitting in a large loop and you, you know -- you would never [inaudible] region. >> Tucker Taft: Well, why is the loop-carried state growing? That's what I'm not sure about in your -- >>: Everything that was -- I mean, it may grow in some cases, but it will all be allocated in a region that contains the entire loop. So you wouldn't free that region until the entire loop exited. >> Tucker Taft: Yeah. I guess I would have to see the example, because each loop iteration has its own region, potentially, if it needs one. And then you have the outer region. But the loop-carried state -- I mean, I'm not certain what makes it grow. I'm trying to think of examples where it would be growing, and that would be helpful to see.
I do have some examples of loops, and maybe in there you'll be able to say, oh, a loop-carried state, and that's going to kill you -- so maybe you can save it for those questions. Okay. So I've been kind of diving down into the underlying guts of ParaSail, no pointers and so on, but at a higher level, what does it look like? Well, it's an object-oriented parallel programming language. It has interfaces and classes. One thing a little different about it is that every class has an interface. So in Java, and I believe in C#, you have an interface which has no implementation, and you have a class which has no interface, and classes can implement interfaces if they want to. In ParaSail every class has an interface, which is its external visibility. The only way you get to a class is through its interface. But you can say that I want this class, or this interface, to implement other interfaces. So you can define a new module, which is the name for the combination of an interface and a class, and say I want this to implement this other interface as well -- so it's my own little super-amazing hash set, but it implements the more abstract notion of set. And you can implement interfaces whether or not they themselves have a default implementation. I mean, in Java you often see where there's a nice interface and there's a default implementation, which is sometimes called an adapter or various names for it. Here every interface can have a default implementation. If you don't want to have an implementation for an interface, you say abstract interface. That just means there is no class, there is no default implementation. But you don't have to. You can implement non-abstract interfaces elsewhere. So anyway, the other thing is that every module is parameterized. So there's no notion of generic versus non-generic. Everything is parameterized. You can have no parameters, but the default is parameterization, and that, right from the beginning, sets your mind on saying most things will be parameterized. And it supports inheritance in the usual way. A type is an instance of a module. There is exactly one syntax for declaring a type, and that is by instantiating a module. All types have this syntax, including arrays, records, enumerations, integers -- you think it up, it uses this syntax. So it identifies the module that you're instantiating, it gives the actual parameters, and this little "new" here is optional. And what that determines is, if you say new, I want this to be considered a distinct type. I don't care if the actuals and the module are identical to some other instantiation over here; when I say new, this is, in Modula-3 terms, branded. It's got its own identity, and it allows me to make distinctions that might otherwise be blurred by the compiler. So if I declare type T is Integer 1..100, that's equivalent to every other time I say type T, or type R, is Integer 1..100. But if I say type T is new Integer 1..100, then that's a different integer type. You can't operate on them together without conversion. >>: Is that a purely static notion? >> Tucker Taft: It's purely static. >>: So the representation would be the same? >> Tucker Taft: Yes. >>: Okay. >> Tucker Taft: And because of no pointers, there's a pretty big distinction between an object that stores exactly one type, versus an object that stores that type or anything that looks like it.
So if you want to have actual polymorphic objects -- so you have an array of hash sets, or you have an array of any old kind of set -- those are considered different things. If you want to have a polymorphic array, for example, or just a polymorphic object, then you add a little plus sign at the end of the name of the type, and that means anything that implements that interface. So Set+, or Set<Integer>+, would be anything that implements the interface Set instantiated with integers, and you can have an array of them and so on. Because objects can grow and shrink, there isn't this problem where once you've created it it's sort of locked into a particular type. You can store different kinds of set into a polymorphic set object during its lifetime, and that's just fine. Or you can set it to null. Because of this expandable storage model, you can do things that would be a little awkward or somehow inefficient in other languages. So a type is an instance of a module, and an object is an instance of a type, and you can have either vars or constants. You can initialize the vars; you must initialize the constants. If you say optional T here, then it default-initializes to null. If you don't say optional T, then it's not initialized from the point of view of the semantics, and you must initialize it before you use it. So that becomes another verification condition, if you will, to be proved. And that one is a little easier than most. So we've got module, type, object, and then operations, which are the things declared in modules that operate on objects. So all the code lives in operations. I've said most of this: all modules are parameterized; every module has an interface and, unless abstract, a class; and so on. And type equivalence is structural unless marked new: if they have an identical module and identical parameters, then they're equivalent, unless you say new. Here is just a little bit of syntax. We've already seen one before, but this shows you an interface, N_Queens, its parameters, and then some local types that have been declared. Here we have one; they use branding here to get a new integer type, which we're going to use. Its bounds are something like minus N times two to N times two, or whatever; these are used for numbering the diagonals on a chessboard. This is for numbering the rows. This is for numbering the columns. And this is used because we're trying to do the N-Queens problem. One solution for the N-Queens problem is a list of column numbers indexed by row, essentially saying, for this row, where is the queen? So if in the first row the queen is in column one, the second row the queen is in column three, the third row the queen is in column five, and so on. So a solution to the N-Queens problem is an array of columns. I wrote optional here just to show the syntax, but also to show a postcondition. So when you start off trying to come up with a solution, you don't have any queens on the board, so their column number is null. So what's in row one? Null, no queen. What's in row two? Where's the queen in row two? Null. So you start off where the empty, or incomplete, solution has nulls for all of these guys. And then we define this function Place_Queens, which returns you a vector of solutions. The intent is that it returns you a vector of all possible solutions.
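[The interface being described looks roughly like this -- a reconstruction based on the published ParaSail N-Queens example; module names such as Integer, Array, and Vector, and the exact constraint syntax, are approximate:]

    interface N_Queens <N : Univ_Integer := 8> is
        type Chess_Unit is new Integer<-100 .. 100>;
            // a branded ("new") integer type, big enough to number
            // rows, columns, and the diagonals
        type Row is Chess_Unit {Row in 1 .. N};
        type Column is Chess_Unit {Column in 1 .. N};
        type Solution is Array<optional Column, Indexed_By => Row>;
            // for each row, the column holding its queen (null = no queen yet)

        func Place_Queens() -> Vector<Solution>
          {for all Sol of Place_Queens =>
             for all Col of Sol => Col not null};
            // postcondition: every returned solution has every row filled in
    end interface N_Queens;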
And here is a postcondition, where we're saying that for all solutions of Place_Queens -- Place_Queens, when used here, is talking about the result of calling it, so it's sort of Place_Queens' result -- for all of those solutions, for all of the columns of the solution, the column is not null. So what we're claiming is that we're returning a vector of these guys, and when you get it, all of these column numbers are filled in. So that's the promise. Does that make sense? So, little bits of syntax here. Things that are assertion-ish, like constraints -- we've got some constraints here on these row and column numbers -- postconditions, preconditions, they all uniformly use this brace notation, sort of borrowed from Hoare logic, where preconditions come before the arrow here, postconditions come after, and then type invariants, which appear kind of in the middle of the class definition. But generally any place you see one of these guys, that's some sort of assertion-like thing. In terms of calling operations, you can either put the object -- the first parameter -- out front, sort of like a lot of object-oriented languages, or you can put all the parameters inside. If you need to identify the operator you can use the Type::Operation notation, but in general the name space of each parameter type is automatically searched. So if you've got X + Y, it'll look for a "+" operator in the module that defines the type of X and in the module that defines the type of Y. So you rarely have to qualify operators or any kind of operation name. It pretty much finds it by doing [inaudible]. And it uses the context to do that, so it's pretty rich in its ability to find the operators. Yeah? >>: Back on the queens example. The capital N in your type -- is that something that's static, or can you actually instantiate these interfaces on different numbers at runtime? I'm thinking I want to write a program that lets the user tell me -- >> Tucker Taft: This is kind of a weird way to do it. It would make more sense to pass it as a parameter at runtime. If you do it here, the requirement is that it ultimately be known. It doesn't need to be known here, obviously, because we don't know what it is, but when you instantiate a module all the parameters must either be other parameters of an enclosing module or be literal values. >>: The usual [inaudible] instantiation? >> Tucker Taft: Exactly. It's almost more methodological. If you're going to prove things, sometimes you want to be able to expand them out. And so it's sort of saying, well, let's not make the prover's life too hard. There's nothing really in the language that couldn't deal with these dynamically, since most of the time you don't know what they are. It's more that overall there's kind of a guarantee that if you go back to the very initial instantiation you'll know everything, which is sort of nice in some cases, but it's really more of a methodological thing. In some ways that's the difference between the parameters to a module and the parameters to an operation. Okay, the other thing that's a little different is that there are five types that each correspond to a kind of literal. So unlike in C, where you've got to say, well, this is a long integer literal or a very long float literal or something, there's just one integer literal type and it's universal, you know? It's infinite precision. Same thing with real. Universal string is an arbitrarily long vector of universal characters.
Universal characters are Unicode -- you know, you get Klingon and everything else you could imagine. And then the one that's a little odd here is the universal enumeration type. It's not odd if you're familiar with how things in Lisp work, but essentially we've given enumeration literals a distinct syntax. So when you define an enumeration type, you're not introducing new literals. The literals are out there. You're saying that these literals are legal values of my type, or can be converted to my type. The way literals work is that to allow the use of a literal with a given type, you have to declare the from_univ operation. And that from_univ operation is a conversion from one of these universal literal types to your type. And so if you declare one of these from_univ operators, then you get to use those literals. That's the rule. And the precondition on that conversion determines what subset of those literals you can use with this type. So if it were a 32-bit integer, then the precondition on the from_univ would be that it's got to be in the range minus two to the 31st to plus two to the 31st minus one, or something. If it's an enumeration type the precondition is critical: it identifies which enumeration literals can be used with this type. So if it's red-green-blue, then that precondition says that it can only be red, green, or blue. Does that make sense? And if you want to do conventional 8-bit characters, well, then the character code would have to be in the range for that purpose. So, I talked a little bit about the generalized indexing. Basically all these different kinds of things can all be seen as either key-goes-to-value or key-goes-to-"I'm here", kind of like a set. And they're homogeneous, at compile time at least. They might conceivably be polymorphic at runtime, but from the compiler's point of view they all look the same. And you want to have things like iterators, indexing, slicing, combining, merging, concatenating and so on. You might like a representation for empty containers, some way of creating a literal container; they can grow and shrink over time -- automatic storage management. So the idea is there is some nice syntax I would like to be able to use with my type. And in the same way that if you define that from_univ operator you get literals, if you define a certain set of particular operators, like "indexing" and the empty brackets and or-equals and a few others, then you get to use the nice syntax. And here's an example of the syntax: Container[Index] for indexing, Container[A..B] for slicing, empty brackets for an empty container. You can do container literals, both positional and named, and so on. And these are just syntactic sugar for a call on one or more operators. Building up a thing like this, it calls the empty-container operator, then it calls something to add more elements into the container. The compiler generates those calls for you, and then if it finds a bunch of calls which depend only on values that are known at compile time, then it essentially evaluates them at compile time and replaces it with a reference to that constant. So it does use these operations to implement things like containers, but if it's [1, 2, 3] it will actually do the computation at compile time to evaluate that aggregate. Okay. So the whole point of a lot of this was to make it possible to have pervasive parallelism, and I've already talked about why.
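[A sketch of the from_univ mechanism just described, using a made-up Color type; the operator profile, the Univ_Enumeration name, and the #literal spelling are approximate:]

    interface Color<> is
        op "from_univ"(Univ : Univ_Enumeration)
          {Univ == #red or else Univ == #green or else Univ == #blue}
          -> Color;
            // the precondition says which enumeration literals are legal values
    end interface Color;

    // With that declaration in place:
    //    var C : Color := #red;    // fine
    //    C := #purple;             // rejected: the precondition can't be proved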
The underlying model is that there are thousands of threads -- too many to worry about individually -- so we're just going to use them like water, use them like registers, use them like virtual memory. They're just a resource; they're not something that you should be getting too hung up about individually. Your goal is to use as many as possible, the language should make that easy, and it should also prevent some of these nasty situations automatically. If you're familiar with how register allocation works, it's actually a pretty similar problem. You're trying to use cores the same way you're trying to use registers, trying to avoid using the same register for two different variables that have overlapping lifetimes, and things like that. Parallel by default: syntactically, F(X) + G(Y) -- Tennis! Well, I'm missing my tennis date, how bad is that? So F(X) + G(Y) is evaluated in parallel, or at least it can be. The compiler decides whether to; it doesn't do any extra work, but when it sees F(X) + G(Y) it will say, well, that can be evaluated in parallel. In fact, you can't write something that cannot be evaluated in parallel -- that would be illegal. You can't say F(X) + G(X) if F and G can both update X, or if either one of them can update X; that would be illegal in ParaSail. So everything you're allowed to write can be evaluated in parallel. The semicolon is kind of a separator between statements. If you really want to say I want to do all these things in parallel, you can put double vertical bars, and then you're limited by the same rules: you can't write that unless they can be safely evaluated in parallel. If you use a semicolon, that says, well, if you can do it in parallel please do; if you can't, then don't complain. And then there's another separator, "then", which says: process X, then process Y, then process Z -- don't do them in parallel. You may think it's possible, but don't do it. Of course the compiler can always say, well, I'm so smart I can do it anyway, but that's not recommended. >>: So it's up to the programmer to still sort of figure out, I mean, there's this whole work-span and communication cost business for [inaudible]. Are you leaving that up to the [inaudible]? If you [inaudible] parallelism then you're going to be slower than sequential, I mean, that's just -- because there's a lot of experience with people taking pure functional languages where you have amazing amounts of parallelism, but the overheads really kill you and it's still very hard to do the right [inaudible]. >> Tucker Taft: Yeah. I sort of thought I had more time, sorry. Yes, there is clearly tuning involved in coming up with the right granularity there, and right now the heuristic is very simple. It must involve at least one out-of-line call -- if it's F(X) + G(Y) and those are both out-of-line calls, then it will create picothreads. But these are not actually run on different CPUs; they're simply candidates for being run. And it uses a work-stealing model, if you're familiar with that model, and so whether that's too small, that's a good question. The intent is to make them such low overhead -- these threads don't have any context, they're really minimal overhead. But you know, you're not going to do it for X plus Y, that's the idea. So I should probably wrap up here so I don't blow out completely.
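[A tiny sketch of the three statement forms just described; Process, X, and Y are illustrative names:]

    func Process(V : Univ_Integer) is
        const Ignore : Univ_Integer := V * V;   // stand-in for some real work
    end func Process;

    func Demo_Control(X, Y : Univ_Integer) is
        Process(X) || Process(Y);     // "||": do these in parallel (must be provably safe)
        Process(X);                   // ";" : in parallel if safe, otherwise sequential
        Process(Y);
        Process(X) then Process(Y);   // "then": do not run these in parallel
    end func Demo_Control;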
Somehow I was thinking I was going for five o’clock, but 4:30 is where I was headed. So there’s a lot more to say, but you know there’s always another time to say it. But I’m happy to answer more questions if you have some at this point. Okay, thank you. [applause]