>> Rustan Leino: Good afternoon everyone. I’m Rustan Leino. It’s my pleasure to
introduce Tucker Taft, who will talk to us about his new language design. So Tucker
has been involved in languages and different program analysis techniques for
decades, starting at Intermetrics in Boston in the 80’s and 90’s and then running his
own company, SofCheck, from 2002, which more recently was acquired by AdaCore,
which I guess seems to have collected a number of companies.
So today he’ll tell us not about Ada, even though he’s been quite involved in Ada 95
and Ada 2012, but about ParaSail. So, welcome.
>> Tucker Taft: Thank you. So I thought I might also say, you know, why am I here?
Well, I met Rustan at a very nice little conference in Maine of all things, and I
brought along my Maine hat just in case there was any doubt that I’d been to Maine.
And anyway, I think we sat in on each other’s presentations and there was a
tremendous amount of overlap in what we were having to say about what we
believed in and what we were trying to accomplish and so on. And so Dafny and
ParaSail I think are trying to address some of the same issues.
And this morning I was walking over here from my nice little B and B and I was
thinking, well, let’s see, ParaSail plus Dafny -- ParaDafny? ParaDucks. So that
was my official name for the attempt to combine these two. Anyway, this is about
ParaSail, not about ParaDucks, but I think you will see some similarities and overlap
at least if we get into the stuff about assertions.
I’m going to start talking to you more about the parallelization attempt and -- So I
set out a while ago to start designing a new language. I’ve been involved with Ada
since 1979, roughly, which is before Ada was even a standard. And I got very
involved pretty soon thereafter. But in 1990 we won the contract to do the first
revision of Ada, Ada 95 as it’s now called.
At the time it was Ada 9X because we didn’t know what it would be. And I was the
lead technical person on that. But well before that, in the 70’s, I started getting
interested in programming language design. I don’t know why, but it was sort of
one of those things, you know, that hits you, and I’ve always been interested in it
and, as I said, about three or four years ago I got interested in starting over. Let’s just
take a blank sheet of paper: what could we do if we really wanted to start from scratch?
And one of the reasons to do that was this whole multicore thing. That is we’re now
dealing with a world where the number of processors is going to be growing or
doubling, perhaps every two years. We used to get twice-as-fast processors,
but now it’s going to be twice as many, and that’s a fundamentally different problem
to solve.
In the old days we could sit on our duffs and our software would get twice as fast,
now we actually have to get a little more clever to take advantage of all those extra
processors.
So anyway, this language is called ParaSail, for Parallel Specification and
Implementation Language. It’s intended to be a very formally oriented language. It
has built-in assertion syntax and so on, and it’s pervasively parallel. That is, it’s easier
to do things in parallel than sequentially in some ways.
Starting from scratch doesn’t really happen. So I have to admit that I studied a few
languages in my time, and you will probably see plenty of them coming through on
some of this, but there are a few things that I think are unusual. And it may be the
combination, which is perhaps unique.
Anyway, and this is a little bit of an insult to C and C++, and I’m probably insulting at
least some of you here. But my belief is that C and C++ are two of the least safe
languages designed in the last forty years, and they are winning in the safety-critical
world.
I’m very involved with safety-critical systems, and it used to be almost 80-20: 80
percent were written in Ada and 20 percent were written in C. But as the number of
safety-critical systems has grown and they’ve sort of branched out from the military
industrial complex into sort of just the commercial world, Ada has not followed that
path.
And so now instead of 80-20 Ada to C, it’s more like 80-20 C to Ada. And C++ is
used not at the highest level of criticality, but it’s certainly growing in use in some of
the safety-critical systems. So that’s kind of, you know, I understand why Ada did
not make the jump.
Is that me? I don’t know who that is. It’s not me.
Anyway I don’t think that’ll bother us too much. What’s the beeping?
>>: It’s the battery back up actually.
>> Tucker Taft: Battery back up? Is that for my Pacemaker? You heard about the
guy who hacked his own pacemaker? That’s happened recently. That’s kind of cool.
That’s just to prove that you can hack into anything I guess, especially if you’re a
little crazy and you have a pacemaker.
Anyway, so one of the things that I think most of us in this room recognize, but not a
lot of people elsewhere recognize, is that computers actually stopped getting faster in
about 2005, which is longer ago than I would have guessed. But my laptop is looking
a little old here. But it’s still three gigahertz, and if you go out and buy a laptop today
it’s kind of still three gigahertz, you know?
And if things had kept doubling, which is what they were doing for 25 years or so,
this would be like in the 50-gigahertz zone, if I got a new one that is. Anyway chips
are going to start having more and more cores.
The other thing that I think has happened is static analysis, and that’s represented
well within this group, has actually come of age. And it’s time to get it into the
language, you know, stop beating around the bush where you have a separate tool
or whatever, you know? Let’s get it into the language.
This is just a slightly enlarged version of that IEEE slide. And it’s pretty dramatic
what happened in 2005 where they just started cooking their chips and they
decided that was not a good idea and maybe we should try to figure out how to
make them higher performance in other ways, and that’s where all this multicore
stuff started.
It’s interesting to go back and look at articles written in that zone, 2004, 2005, 2006,
where they realized that things really were going to change. Intel calls this their
right turn, because this is a huge deal for them. They were having no trouble selling
a new chip every couple of years to people with laptops when it was twice as fast,
you know?
You got a 1-gigahertz? My goodness, come on! 3-gigahertz is what it’s all about!
But now it’s kind of hard to convince people to say, oh you got to have a new laptop
because it’s sort of prettier, or, you know, thinner or something. Anyway, I
mentioned I drew from various languages.
ML was certainly a source of much inspiration. Ada certainly. Cyclone -- I don’t
know if you know Greg Morrisett, but this is something he worked on, with region-based
storage management. CUDA, Cilk -- a lot from Cilk. Anyway, a lot of these
influences will show up probably.
So I had this idea, a new programming language, and I knew I wanted to have lots of
good parallelism and lots of good formalism, and I knew that since I was going to
start adding things, it had to start from a pretty simple core. So I started ripping
things out from languages that I loved, to try to get down to the absolute bare
minimum.
And I think if you can do that that’s generally a good thing. If you’re trying to do
parallelism and formalism, well, one reason was to make conceptual room in people’s
minds. You know, if the language is already complex and then you start adding
more into it, it’s going to be really hard to deal with.
But the other reason was I wanted to be pervasively parallel and safe and easy to
verify. Maybe easy to verify is an overstatement. Straightforward, that’s my usual
word for when I can’t say easy, but easier to verify. And getting rid of things is
definitely a good place to start when you’re trying to do that. Parallelization,
parallel by default, every expression is parallelizable. So if you write F of X plus G of
Y, F of X and G of Y can be evaluated in parallel and the programmer doesn’t have to
worry about it. It happens automatically. If you look at languages, if you’re familiar
with Cilk, it’s a language that came out of MIT -- Charles Leiserson,
I believe -- and then was bought by Intel and they’ve put more energy into
it. It has very easy parallelization but you’ve got to say spawn.
If you want an expression evaluated in parallel you say spawn this or spawn that,
and it’s caveat [inaudible]. If you say that, they’re assuming well then it must be safe
to do that. There’s no real guarantee that that’s a safe thing to do.
I wanted a language where there wasn’t that issue that the parallelization was
always safe, and if it weren’t, the compiler would catch it. And make it so easy and
sort of baked in that it’s easier to write parallel than sequential. Because I think if
we want our programmers to all write it this way, we have to make it really easy.
And then from the formalization point of view, there’s been a growing acceptance of
pre- and post-conditions, type invariants, that sort of thing, getting them right into
the program, and in my view the compiler ought to be the one that’s enforcing these.
Now, it may be a smarter compiler than your average bear; maybe it has Boogie and
Z3 in its back pocket. But I think from the programmers’ point of view it ought to
be part of the same tool.
Every time you compile, that gets checked. And if you can do all that at compile
time, then do we really need runtime exceptions, the overhead of runtime exception
handling and the conceptual overhead of dealing with exceptions, which make
thoroughly testing a system much harder?
No. So here’s my little list of simplifications. And actually it is more than just these,
but this is already enough just to have something to chew on. So no pointers. That
actually kind of came last. It was not where I started at all. I knew global variables
were going to be trouble.
If you’re trying to parallelize F of X plus G of Y and you’re not allowed to look inside
them, you’re just looking at their spec, well, either you’ve got to be very explicit
about what modifies what, which is a pain in the neck for programmers to maintain,
or you’ve got to just say, well, no global variables.
Why do we need them? Well, you know, we’re used to them or it’s painful to pass
around parameters and so on. But I was willing to give it a shot anyway. No hidden
aliasing. That is, when you get into a procedure, if you’ve got two parameters, X and Y,
and you can do an update on one or the other of them, you can rely
on the fact that they’re not the same parameter.
So if there’s any aliasing it’s things like A sub I versus A sub J, those might be the
same, and that’s determined by whether I equals J. But X versus Y? Well, there’s no
way those could be aliased unless you pass them as a parameter, for example, or one
was a global and one wasn’t. So getting rid of that sort of aliasing, and then various
other simplifications.
This one is I think relevant to people thinking about parallelization. A lot of
languages have explicit threads or are adding them, as C++ just did recently, and I
think that’s not the way to achieve massive parallelization or pervasive
parallelization. It’s like asking programmers -- in the old days in C you had to say I
want this variable in a register.
So you’d name like, you could pick three of your local variables, and you could say
register, and then the compiler would, you know, yeah, okay, that’ll be register
three, that’ll be four, and that’ll be five. And at some point we said, you know, that’s
not really the programmers’ concern. There are too many registers. We got 16, now
we got 32 registers. I can’t have them trying to figure out which one of these local
variables should be in registers.
And it’s probably some of the temporaries that should be in registers. So as the
number of cores begins to grow, expecting the programmer to say, well, I think this
should be a thread and that should be a thread and, you know, one core for all this
code and one core for all this code -- well, there are too many. Maybe there are a
hundred threads -- or a hundred cores. It just gets out of hand.
Same thing with locking and unlocking, you know? If you have to kind of worry
about, oh jeez, I guess I better lock because might there be two threads, that’s just
not going to work. And again the idea of waiting and signaling, doing it explicitly,
where okay, I’m waiting on this condition variable and hoping somebody is going to
signal me at some point, and I hope they do it for the right reason and
so on. Trying to get rid of some of those things.
No race conditions. That is, the compiler ensures that there is no undesirable
simultaneous access to shared data. If that is possible it would either complain or in
some cases perhaps insert additional synchronization, but the general rule
is that the language is designed so that the compiler can complain about any such
possibilities.
So why those simplifications, and in particular, why pointer-free? I mentioned F of X
plus G of Y, and a lot of it comes down to thinking about that problem. We’re just
trying to safely evaluate F of X and G of Y in parallel, without looking inside of F and
G. So what does that mean?
Well, let’s assume X and Y were parameters coming in to us, and now we’re calling
off to F of X plus G of Y. Well, we probably ought to be able to say that those aren’t
aliased. So no global variables is clearly helpful: if F or G both manipulate the same
global variable, they might be stepping on the same object.
No parameter aliasing is important, which I already said several times. And what do
we do if X and Y are pointers? Well, I’m not even sure what that means any more. If
I’m saying no global variables, what can they do with a pointer? Well, is
dereferencing a pointer referencing a global variable? Depends on your model of how
global variables work, but let’s assume it’s not. We have this fundamental problem.
If I give it X and Y, is there a danger that F or G could follow the pointers I give it and
ultimately arrive at some common object, Z? And the answer is, without a lot of
analysis, that’s probably quite possible.
You don’t really know where pointers ultimately lead. You can do various kinds of
static analysis to prove that X and Y are separate, you know, separation logic and so
on, but short of expecting the programmers to do a lot of annotations that are somehow
going to allow you to prove that, and we’re not allowed to look inside F or G,
pointers make life a lot harder.
Furthermore, if we have parameter modes like in-out, or in, or var versus non-var,
how does that relate to the things that they point at? In a language like Ada, which
has parameter modes, you can declare pointers that only give you read-only access
to the object they designate. But then it only ever gives you read-only access.
If you pass a pointer as an in parameter and it’s an access to a variable, then the
callee can update that variable. So parameter modes don’t really help very much
here. You’ve got to be able to know, well, what are they going to be able to do with the
things they reach through levels of indirection. So after almost a year or two of doing
language design I finally arrived at the answer: well, let’s just get rid of pointers
completely.
And I started trying to write code that didn’t use pointers. Pretty early on I realized
that I needed to represent things like trees without pointers, and the way to do that
is essentially to add to every type an additional value. This is like Maybe types in
many functional languages and so on, and pointers, for that matter, have null values.
But don’t think of null as a pointer; just think of it as like a very small value.
Or it might be equivalent to sort of uninitialized, but it’s got a name and you can
check for it, so it’s not quite as bad as just a random bit pattern. Anyway, so for
every type, be it a record type, or an array type, or an integer type, or a Boolean type,
we’re going to add an extra value to that type. So it has all its usual values -- true,
false -- plus null. And you can use null when you declare an object or a component or
parameter to be optional.
And that basically means, okay, that might be a null value. If you don’t say optional
it can’t be. It’s got to be one of the normal values of the type. But if you say optional,
then that allows you to put null into that object. And my conclusion was that a lot of
uses of pointers are for that purpose. That is that purpose of starting off with
nothing and then putting something there. And a pointer is sort of the only way to
do that in many existing languages.
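A rough sketch of what such declarations might look like in ParaSail (the specific types and names here are illustrative):

    var Count : optional Integer := null;  // "optional": null is a legal value
    var Total : Integer := 0;              // not optional: null is not allowed
    Count := 42;                           // later, a normal value replaces null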
But if you just sort of say, well, let’s add another value called null, then I can start off
with nothing and then I can put a tree there. Or I can start off with nothing and put
an array there. And that’s one of the big uses for pointers. Now, if you’re not going to
have pointers, then assignment had better not be pointer assignment, because then
suddenly you’ve got pointers again.
So assignment is going to be by copy. But that’s kind of painful. Do we always want
to do it that way? Suppose I’ve got a big tree here and a big object I’m trying to put
into that tree, or I’m trying to put an object into a set or into some other kind of
container.
I don’t want to make a big copy of it and then delete this, so why don’t I just actually
be able to move it in? So if you’re not going to have pointers, then you may need
additional operators that are similar to assignment but are for the purpose of
fundamentally moving a piece of data into a container.
Or swap is the same thing. If assignment does copying, you’re trying to, you know,
switch two halves of a tree to do balancing or something. If I have to do it by
assigning to a temporary that makes a copy then I assign this over to here and that
makes a copy, then I assign over here and that makes another copy, that’s a lot of
copying.
If I can just swap the left and right halves of a tree, then that’s much nicer. So that
gets you a big portion of the use of pointers, and then let’s generalize the notion of
arrays. Fortran had arrays; it didn’t have pointers. It didn’t have records, for that
matter when it started off. And people managed to survive, but it was painful and
pretty quickly we got away from that if we could.
But if you have containers that allow the user to define what indexing means -- so it
might be a hash table or it might be a set or it might be an array, it might be some
kind of expandable vector, but it’s got the notion of an index -- you can say I want to get
the nth element, or I want to change the nth element, or maybe it’s the element
whose key is ABC.
There’s a general notion of an index or a key, and there’s a general notion of a
container. And if we allow the programmer to implement these sorts of containers
and then use indexing syntax to refer to them, perhaps we don’t really need
pointers even for cyclic data structures.
We know for tree-like structures we can do without them, but for cyclic data
structures it’s pretty hard to do without them unless you have some alternative.
And so let’s just make indexing very general. The individual elements in your
container can grow and shrink because they can be null and they can become non-null values.
So we can write things like “for each N in DGraph sub I dot Successors loop,”
and the successors are not pointers, they are a set of node IDs, and we are now
iterating through all of the successors of some graph and we can add edges to the
graph, we can add nodes to the graph and so on, but we don’t really need pointers
for that purpose.
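A rough reconstruction of that loop in ParaSail syntax; DGraph, Successors, and Visit are illustrative names:

    // N is a node id -- a plain value, not a pointer.
    for each N in DGraph[I].Successors loop
        Visit(DGraph[N]);
    end loop;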
In fact, in my experience, when I started looking at code I’d written in the past, using
a language that had pointers, to do directed graphs, I frequently didn’t use the
pointers to do the directed graphs. I frequently did it this way anyway, for other
reasons, like the nodes often have existence independent of who’s pointing at them.
You create a graph with a bunch of nodes and then maybe you add edges and
remove edges and so on, but you don’t want the nodes to just disappear because no
one’s pointing at them because they have some existence independent of the edges
connecting them.
So it’s not obvious that pointers were the right solution for a directed graph
anyway. So my hope was, well, given this, I can do anything that I could have done
with pointers. And with this additional stuff I minimize the actual need for them.
The other thing that came along with this was objects are now sort of self-contained.
That is, they grow and shrink, but they don’t dangle, if you will. You can’t have a
dangling reference to an object because there are no references at all. Objects live
conceptually in the stack frame where they are declared, and they can grow and
shrink, but no one’s going to have a pointer to them that might outlive their lifetime.
So that allows you to change your whole storage management approach. So no
global heap, because you don’t need to create objects in places where they can live
indefinitely. They can only live as long as the scope in which they exist. So we now
move all of our storage management to local region-based storage, where the
objects are local and there’s essentially a local heap, which is essentially what a
region is.
And their growth and their shrinkage are all within that local heap.
>>: So [inaudible]. So you’re not going to really have the heap in the sense, well,
you’re not going to have cycles, really explicit cycles, because you don’t have
pointers? But you’re going to simulate that because you’re going to say, well, you
know, the index of this thing is over there in that collection at index J. You’re
going to --
>> Tucker Taft: Well, an index is just a value. It’s just a key of some type. You can use
it to some degree as you see fit. It’s a scalar value. It’s a string, you know, and you
can store it in other -- I mean here I’m imagining I’ve got a set of those [inaudible]
stored in the successors set, but I could have come up with it, I mean, where did it
come from?
Maybe it was passed in as a parameter, maybe it was the result of --
>>: [inaudible] out of bounds? You said you had no runtime exceptions. So you still
have the notion of a data structure with some bound to some extent.
>> Tucker Taft: Right. So there are preconditions on using [inaudible] under
certain circumstances. And, for example, it might be that it’s got to be in the range
1-100 if it’s a hundred-element array. And so those preconditions are things that
have to be actually checked at compile time, and you have to prove that you’re
within the range. Or the alternative is that you might have a data structure like a
map where it returns a null if there’s no value at that index.
So those are the two ways you would deal with that. Does that answer your
question? Okay. Other questions?
And one thing that you often, that you do bump into, is you now have got a tree, let’s
say, and it’s maybe fairly elaborate, and you’ve now asked to have this sub-sub-sub
component replaced. It’s currently null, so it’s sort of like a little stub on the end of a
tree somewhere. I’m now going to put a branch there.
And let’s suppose I do this by passing it as a parameter, as a var parameter:
that particular stub, I want you to stuff something into it. Well, currently it’s null, and I
want to be sure that you put that new piece of the tree in the right region. So either I
have to pass in to every procedure an additional indicator of what region I want you
to put things in if you allocate them and add them to my tree, or the tree itself has to
identify where that addition should go.
Does that make sense? You’re trying to make certain that all of the pieces of this
tree live in the same region. And I’ve now asked you to plug something into this null
part of my tree, put a non-null piece into that. So what we do is we make each null
identify what region it is for.
So null is not just the value zero. It actually identifies -- it says I’m null, but by the way,
if you want to replace me with something non-null you’ve got to put that in region Q,
for example. Yeah?
>>: It seems a little odd though, if you have a tree with holes like that that are
associated with particular storage regions, what happens when you use your copy
operator to copy the whole tree into some other region and now these guys are still
talking about the old one? You got a whole bunch of fix-up to do.
>> Tucker Taft: Correct. When you create a copy, the nulls in the new value
identify the new region in which it’s living.
>>: So although in the original tree they could be referring to regions other than the
tree’s --
>> Tucker Taft: No. Okay, they’re all referring to the same region that is the tree’s --
>>: So there’s some notion of parentage, that a null carries the region of the container
it’s living in.
>> Tucker Taft: Exactly. And it’s the place where that object was declared that fully
determines everything about all the storage within it. Now, it is possible to have
what you would probably call a pointer.
Short-lived references to existing objects, for example, when you pass a parameter
we don’t make a copy, we just pass a reference to the object that you’re referring to.
But it is a hand-off if the object is being updated by that function you’re calling.
When you pass it you are essentially logically handing it off, and no one else is
allowed to update it while it’s being manipulated by that function. And you can also
return references, but you can only return references to things that are passed to
you. And that’s how indexing is implemented. You have to be able to allow users to
implement their own indexing. What can they do?
You pass them a map; they can pass you back an index to an element of that map. I
mean a reference. So trees. I’ve already talked about this but this is a little more
specific. How is a tree represented?
So here is an example. In ParaSail, every module has an interface and a class that
implements it. The interface gives you the external view and the class gives you the
implementation. This is an external view of a non-opaque type. We’re essentially
exporting how is this type represented?
But you could put this all in the hidden part of the class itself and then just provide
operations for, you know, setting and getting these things. But at this point this is
just indicating how you would represent a tree or a node of a tree in ParaSail.
It has a payload, which is of the parameterized type here, and left and right,
which are optional trees, default-initialized to null. So that’s what a node of a tree
looks like. And then they grow by having trees plugged into the left or right
component. Does that make sense?
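A sketch of roughly what that interface looks like (the parameter constraint and defaults are approximate):

    interface Tree_Node<Payload_Type is Assignable<>> is
        var Payload : Payload_Type;
        var Left : optional Tree_Node := null;   // subtrees start as null "stubs"
        var Right : optional Tree_Node := null;
    end interface Tree_Node;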
So here are a few little examples of creating a tree. We declare a variable Root of
type Tree_Node over strings, and we initialize its payload with the
value “top”, and left and right are defaulting to null. Then we can assign into the left
tree a new tree node with payload “L” and with its right tree having payload “LR”. And
conceptually these are values, as opposed to things that are pointed at.
These are just values whose size is more dynamic than the average value, but
conceptually it’s a value semantics. And then if I wanted to move some part of the
tree from one place to another I can use this move operator, which has a side effect
of leaving the right hand side null. So essentially, at a semantic level it’s copying the
right-hand side into the left-hand side and setting the right-hand side to null.
If these two happen to be from the same region, you can do that without physically
copying. You can just essentially move the underlying pointers. But from a
semantic point of view, this is assignment followed by setting the right-hand side to
null. Does that make sense?
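A sketch of the examples just described, including the move and swap operators (aggregate syntax approximate):

    var Root : Tree_Node<Univ_String> := (Payload => "top");    // Left/Right default to null
    Root.Left := (Payload => "L", Right => (Payload => "LR"));  // assignment copies the value
    Root.Left.Right <== Root.Right;  // move: the right-hand side is left null
    Root.Left <=> Root.Right;        // swap: cheap within the same region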
Okay. Anyway, a little bit about region-based storage management. Cyclone -- this is
Greg Morrisett and a few others -- Yeah?
>>: Just on the previous assignment. [inaudible] that you did it the other way
around? Root dot left dot right gets root dot right. If you then first do the move and
then put null in the root dot right, you’ll cut off the thing that you just moved. So
does one have to be contained with [inaudible]? Do you have to make sure that one
is not contained in the other when you do the move?
>> Tucker Taft: No, it is smart enough to not lose the data. So you can, for example,
remove an element from a linked list by replacing the linked list with its second element,
essentially, and it will essentially then remove the intermediate element. Does that
make sense? Okay.
So region-based storage management. You may be familiar with it. This is a little bit
of a twist on it in that in Cyclone there are pointers, but the pointers have an implicit
region associated with them. And so what they really, you can think of them as
being [inaudible] into that region. They’re not general-purpose pointers, they can’t
point arbitrarily in arbitrary places, essentially they can give you a reference into
that region.
But the other thing about it is that regions live in a certain scope and they’re local to
those scopes. And when you leave the scope you reclaim the whole thing, which
simplifies at least storage deallocation. So we’re getting the same
advantage of that, but it’s simplified even further because there’s only one pointer to
every piece of data. And so when that pointer is set to null you can immediately
reclaim that storage.
So you have these local regions, which are essentially local heaps, and the space for
an object declared in that scope is allocated from that region. And when this object
shrinks because you have a non-null part of it you set to null, that is immediately
returned to this region.
So the region has its own little storage manager, if you will, and it’s very simple
initially if there haven’t been any things returned. But if you start returning stuff
then it’ll do the usual list of reusable pieces or something. But then when you leave
the scope you just flush the whole thing because you know all the objects that are
living in that region were declared in the same scope.
They can’t be from an outer scope. And as I mentioned, move and swap are very
cheap when you’re staying within the region. It’s analogous, if you think about a file
system, like on Unix maybe, or these days a USB card: I can move a file within the
USB card and it’s very fast. Oh, that’s no problem, it’s just sort of changing some
pointers and so on.
If I move it onto my laptop it’s actually got to copy it, same idea here. Each region is
like a separate disk, if you want to think about it that way. In the Unix world a
separate drive. And if you move within the same region it’s very cheap, or swap, if
you move across then there is some actual copying.
To help in this process, if you know you’re creating an object that is going to be
moved into a particular container, so I’m creating this object, I’m going to move it
into this tree, or I’m creating this object and I’m going to move it into this set, then
when you declare it you can say it’s being created for the purpose of becoming part
of that object.
And so here it will declare this object and allocate its space out of this region, which
you know will outlive this object. So essentially this is like a late declaration of an
object that’s going to be living in an outer region. And then at some point we can
move X into the root object here, which is what I’m showing. Root dot left dot
payload, we move X into that and now we know that’ll be cheap.
This is just a hint. It can be completely ignored, it wouldn’t change the semantics
but it affects the performance. Does that make sense?
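A sketch of that hint; the “for” clause here is approximate syntax:

    var X : Univ_String for Root := "new payload";  // allocate X in Root's region
    // ... later, the move is cheap because X already lives in the right region:
    Root.Left.Payload <== X;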
The other thing about using local heaps is that they’re generally better behaved for
parallelism because now objects are generally not scattered all over the global heap
indicating at what point they grew. And if you’re adding something to an object and
then quite a while later you add another thing to the object and then another thing,
then the object is spread all over your global heap.
On the other hand, if each object lives in its own region, then you know that the space
you’re allocating for it is going to be coming out of that same region. The other thing
is that when you have multiple threads that are all allocating space in parallel, then
they’re generally each working in a separate region when they’re doing that, if you’re
doing regions.
If you’re doing global heap they’re probably allocating out of the same global heap,
and you end up with unrelated threads that are allocating storage right next to each
other. And if they’re going to be updating that storage that’s really bad news
because you’re going to get into cache false-sharing type things where they’re
essentially knocking each others’ data out of the cache.
So region-based storage management is essentially better behaved when you start
getting into multiple cores. Yeah?
>>: So what do you do about [inaudible] state? If I have a long-running loop that
carries state from each iteration to the next, then I have to declare a region that
encloses the entire loop in order for the object to be live, right? In which case I
would [inaudible].
>> Tucker Taft: Loops are implemented by having each iteration be its own thread,
and when you start another iteration you can pass it some parameters, and those
might be what it needs to do the next iteration, for example.
>>: So each loop iteration is a separate thread. So how do you do loop [inaudible]
state?
>> Tucker Taft: At the end of one iteration you can pass a parameter to the next. I
don’t know if that’s answering your question.
>>: In which case you wouldn’t be able to, I guess you could reuse a thread package.
>> Tucker Taft: Yes. The thread is now reusable. I’d have to see an example to exactly
answer your --
>>: So the thing in Cyclone where one thing was a problem was the regions were
not sufficient to be able to write [inaudible] programs. You ended up having to do
things like having linear types and so on so that you could escape the less
[inaudible] of a region-based storage management system, so that you could do
things like [inaudible]-state.
Because otherwise you would just run out of memory if you were just sitting in a
large loop and you, you know -- you would never [inaudible] region.
>> Tucker Taft: Well, why is the loop-carried state growing? That’s what I’m not
sure about in your --
>>: Everything that was, I mean, it may grow in some cases, but it will all be allocated into
a region that contained the entire loop. So you wouldn’t free that region until the
entire loop exited.
>> Tucker Taft: Yeah. I guess I would have to see the example, because each
loop iteration has its own region potentially, if it needs one. And then you have the
outer region. But the loop-carried state is, I mean, I’m not certain what makes it
grow. I’m trying to think of examples where it would be growing, and that would be
helpful to see.
I do have some examples of loops, and maybe in there you’ll be able to say oh, a
loop-carried state and that’s going to kill you, so maybe you can save it for those
questions.
Okay. So I’ve been kind of diving down into the underlying guts of ParaSail and no
pointers and so on, but at a higher level, what does it look like? Well, it’s an
object-oriented parallel programming language.
It has interfaces and classes. One thing a little different about it is every class has an
interface. So in Java, and I believe in C sharp, you have an interface which has no
implementation, and you have a class which has no interface, and classes can
implement interfaces if they want to.
In ParaSail every class has an interface, which is its external visibility. The only way
you get to a class is through its interface. But you can say that I want this class or
this interface to implement other interfaces. So you can define a new module, which
is the name of a combination of an interface and a class, and I want this to
implement this other interface as well, so it’s my own little super-amazing hash set.
But it implements the more abstract notion of set. And you can implement
interfaces whether or not they themselves have a default implementation. I mean,
in Java you often see where there’s a nice interface and there’s a default
implementation, which is sometimes called an adapter or various names for it.
Here every interface can have a default implementation. If you don’t want to have
an implementation for an interface you say abstract interface. So that just means
there is no class. There is no default implementation. But you don’t have to. You
can implement non-abstract interfaces elsewhere.
So anyway, the other thing is that every module is parameterized. So there’s no
notion of generic versus non-generic. Everything is parameterized. You can have no
parameters, but the default is parameterization and that sort of right from the
beginning sets your mind on saying most things will be parameterized.
And it supports inheritance in the usual way. A type is an instance of a module.
There is exactly one syntax for declaring a type, and that is by instantiating a
module. All types have this syntax, including arrays, records, enumerations, integers,
you think it up, and it uses this syntax.
So it identifies the module that you’re instantiating, it gives the actual parameters,
and this little “new” here is optional. And what that determines is, if you say new, I
want this to be considered a distinct type. I don’t care if the actuals and the modules
are identical to some other instantiation over here; when I say new, this is, in
Modula-3 terms, branded.
It’s got its own identity, and it allows me to make distinctions that might otherwise
be blurred by the compiler. So if I declare type T is Integer 1-100, that’s equivalent
to every other time I say type T or type R is Integer 1-100. But if I say type T is new
Integer 1-100, then that’s a different integer type. You can’t operate on them
without conversion.
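In sketch form (instantiation syntax approximate):

    type T is Integer<1 .. 100>;      // structural: same type as any Integer<1 .. 100>
    type R is Integer<1 .. 100>;      // R and T are the same type
    type U is new Integer<1 .. 100>;  // "new" brands U as a distinct type;
                                      // converting between T and U is explicit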
>>: Is that a purely static notion?
>> Tucker Taft: It’s a purely static.
>>: So the representation would be the same?
>> Tucker Taft: Yes.
>>: Okay.
>> Tucker Taft: And, because of no pointers, there’s a pretty big distinction
between an object that stores exactly one type, versus an object that stores a type or
anything that looks like that. So if you want to have actual polymorphic objects -- so
you have an array of hash sets, or you have an array of any old kind of set -- those
are considered different things.
So if you want to have a polymorphic array, for example, or just a polymorphic
object, then you add a little plus sign at the end of the name of the module and that
becomes anything that implements that, or the end of the type, name of the type. So
set+ would be, or set integer+, would be anything that implements the interface set
instantiated with integers, and you can have an array of them and so on.
Because objects can grow and shrink, there isn’t this problem where once you’ve
created it, it’s sort of locked into a particular type. You can store a different kind of set
into a polymorphic set object during its lifetime, and that’s just fine. Or you can set it
to null. But because of this expandable storage model it means that you can do
things that would be a little awkward or would somehow be inefficient in other
languages.
So a type is an instance of a module and an object is an instance of a type, and you
can either have vars or constants, and you can initialize the vars. You must initialize
the constants. If you say optional T here, then it default initializes to null. If you
don’t say optional T then it’s not initialized from the point of view of the semantics,
and you must initialize it before you use it.
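A sketch of these declaration forms (types illustrative):

    const Limit : Integer := 100;     // constants must be initialized
    var Name : optional Univ_String;  // optional: default-initializes to null
    var Count : Integer;              // semantically uninitialized; must be
                                      // assigned before any use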
So that becomes another verification condition, if you will, to be proved. And that
one is a little easier than most. So we’ve got module type object, and then
operations, which are the things declared in modules that operate on objects.
So all the code lives in operations. So, I said most of this. All modules are
parameterized. All modules have an interface and, unless abstract, a class, blah, blah, blah.
Okay, and type equivalence is structural unless marked new. So if they have identical module,
identical parameters, then they’re equivalent unless you say new.
Here is just a little bit of syntax. We’ve already seen one before, but this shows you
an interface, N_Queens, the parameters, and then here are some local types that
have been declared. Here we have one. We use branding here to get a new integer
type, which we’re going to use.
Its bounds are going to be minus eight times two to eight times two, or whatever;
these are used for numbering the diagonals on a chessboard. This is for numbering
the rows. This is for numbering the columns. And this is used for -- we’re trying to do
the n-queens problem. One solution for the n-queens problem is a list of column
numbers indexed by row, essentially saying, for this row, where is the queen?
So if in the first row the queen is in column one, the second row the queen is in column
three, the third row the queen is in column five, and so on. So a solution to the
n-queens problem is an array of columns.
I wrote optional here just to sort of show the syntax, but also to show a
post-condition. So when you start off trying to come up with a solution you don’t have
any queens on the board, so their column number is null. So what’s in row one?
Null. No queen. What’s in row two? Where are they in row two? Null.
So you start off where the empty or the incomplete solution has nulls for all of these
guys. And then we define this function Place_Queens, which returns you a vector of
solutions. The intent is it returns you a vector of all possible solutions. And here is a
post-condition where we’re saying that for all solutions of Place_Queens -- Place_Queens
when used here is talking about the result of calling it, so it’s sort of Place_Queens’
result -- for all of those solutions, for all of the columns of the solution, the
column is not null.
So all we’re claiming is that we’re returning a vector of these guys, but when you get
it, all of these column numbers are filled in. So that’s the promise. Does that make
sense? So little bits of syntax here.
Things that are assertion-ish -- like constraints (we have got some constraints here on
these row and column numbers), post-conditions, preconditions -- they all uniformly
use this brace notation, sort of borrowed from Hoare logic, where
preconditions would come before the arrow here, post-conditions would come after,
and then there are type invariants, which appear kind of in the middle of class
definitions.
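A condensed sketch of the N_Queens declarations being described (details approximate):

    interface N_Queens<N : Univ_Integer := 8> is
        type Row is new Integer<1 .. N>;      // branded integer types
        type Column is new Integer<1 .. N>;
        type Solution is Array<optional Column, Indexed_By => Row>;
        func Place_Queens() -> Vector<Solution>
          {for all Sol of Place_Queens =>
             for all Col of Sol => Col not null};  // post-condition in braces
    end interface N_Queens;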
But generally, any place you see one of these guys, that’s some sort of assertion-like
thing. In terms of calling operations, you can either put the object out front -- the first
parameter can be out front, sort of like a lot of object-oriented languages -- or you can
put all the parameters inside.
If you need to identify the operator you can use the type double colon operator, but
in general the name space is automatically searched, the name space of each
parameter type.
So if you’ve got x plus y, it’ll look for a plus operator in the module that defines the
type of X and the module that defines the type of y. So you rarely have to qualify
operators or any kind of operation name. It all pretty much finds it by doing
[inaudible].
And it uses the context to do that so it’s pretty rich in its ability to find the operators.
Yeah?
>>: Back on the queens example. The capital N in your type, is that something that’s static,
or can you actually instantiate these interfaces with different numbers at runtime?
I’m thinking I want to write a program that lets the user tell me --
>> Tucker Taft: This is kind of a weird way to do it. It would make more sense to
pass it as a parameter at runtime. If you do it here, the requirement is that it
ultimately be known. It doesn’t need to be known here, obviously, because we don’t
know what it is, but when you instantiate a module all the parameters must be
either other parameters of an enclosing module or literal values.
>>: The usual [inaudible] instantiation?
>> Tucker Taft: Exactly. It’s almost more methodological. If you’re going to prove
things, sometimes you want to be able to expand them out. And so it’s sort of saying,
well, let’s not make the prover’s life too hard. There’s nothing really in the language
that couldn’t deal with these dynamically, since most of the time you don’t know what
they are.
It’s more just overall there’s kind of a guarantee that if you go back to the very initial
instantiation you’ll know everything, which is sort of nice in some cases but it’s
really more of a methodological thing.
In some ways that’s the difference between the parameters to a module and the
parameters to an operation. Okay, the other thing that’s a little different is that
there are five types that each correspond to each kind of literal. So unlike in C,
where you’ve got to say, well, this is a long integer literal or a very long float literal
or something, there’s just one integer literal and it’s universal, you know? It’s
infinite precision; same thing with real. Universal string is an arbitrarily long vector
of universal characters. Universal characters are Unicode, you know, you get Klingon
and everything else you could imagine. And then the one that’s a little odd here
is the universal enumeration type.
It’s not odd if you’re familiar with how things in Lisp work, but essentially we’ve
given them a distinct syntax. So when you want to define an enumeration type, you’re
not introducing new literals. The literals are out there. You’re saying that these
literals are legal values of my type, or can be converted to my type.
The way literals work is that to allow a literal to be used with a given type, you have to
declare the from_univ operation. And that from_univ operation is a conversion from
one of these universal literal types to your type. And so if you declare one of these
from_univ operators then you get to use those literals.
That’s the rule. And the precondition on that conversion determines what subset of
those literals you can use with this type. So if it were a 32-bit integer, then the
precondition on the from_univ would be it’s got to be in the range minus two to the
31st to plus two to the 31st minus one, or something. If it’s an enumeration type the
precondition is critical, it identifies which enumeration literals can be used with this
type.
So if it’s red-green-blue, then that precondition says that it can only be red, green, or
blue. Does that make sense? And if you want to do conventional 8-bit characters,
well then the character code would have to be in the range for that purpose.
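A sketch of how from_univ might be declared for that red-green-blue type (enumeration-literal and precondition syntax approximate):

    interface Color<> is
        op "from_univ"(Lit : Univ_Enumeration
            {Lit == #red or Lit == #green or Lit == #blue})  // precondition picks the legal literals
            -> Color;
    end interface Color;
    // With that declared, one can write:  var C : Color := #red;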
So I talked a little bit about the generalized indexing. Basically, all these different
kinds of things can all be seen as either key maps to value, or key goes to “I’m here,”
kind of like a set. And they’re homogeneous, at compile time at least. They might be
conceivably polymorphic at runtime, but from a compiler point of view they all look
the same.
And you want to have things like iterators, indexing, slicing, combining,
merging, concatenating, and so on. You might like a representation for empty
containers, some way of creating a literal container; they can grow and shrink over
time -- automatic storage management.
So the idea is there is some nice syntax I would like to be able to use with my type.
And the same way that if you define that from_univ operator you get literals, if you
define a certain set of particular operators -- like “indexing” and the empty brackets
and “|=” and a few others -- then you can use the nice syntax.
And here’s an example of the syntax: container sub index for indexing, container
sub A dot dot B for slicing, empty brackets for the empty container. You can do literals,
both positional and named literals, and so on.
And these are just syntactic sugar for a call on one or more operators. Building up a
thing like this, it calls the empty-container operator, then it calls something to add
more elements into the container.
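In sketch form, the kinds of operator declarations that unlock this syntax, and the calls the sugar expands to (signatures approximate):

    op "[]"() -> Container;                           // enables the empty container: []
    op "indexing"(C : ref Container; Key : Key_Type)
      -> ref optional Element_Type;                   // enables C[Key]
    op "|="(var C : Container; Elem : Element_Type);  // enables adding elements
    // So  [1, 2, 3]  expands to "[]"() followed by three "|=" calls.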
The compiler generates those calls for you and then if it finds a bunch of calls, which
depend only on values that are known at compile time, then it essentially evaluates
them at compile time and replaces it with a reference to that constant.
So it does use these operations to implement things like containers, but if it’s one,
two, three, it will actually do the computation at compile time to evaluate that
aggregate. Okay. So the whole point of a lot of this was to make it possible to have
pervasive parallelism, and I’ve already talked about why.
The underlying model is there are thousands of threads; it’s too many to worry
about individually, so we’re just going to use them like water, use them like
registers, use them like virtual memory. They’re just a resource; they’re not
something that you should be getting too hung up about individually.
Your goal is to use as many as possible and the language should make that easy, and
it should also prevent some of these nasty situations automatically. I think if you’re
familiar with how register allocation works, it’s actually a pretty similar problem.
You’re trying to avoid, you know, you’re trying to use cores in the same way you’re
trying to use registers, trying to avoid stepping on using the same register for two
different variables that have overlapping lives and things like that.
Parallel by default, syntactically F of X plus G of Y -- Tennis! Well, I’m missing my
tennis date, how bad is that?
So F of X plus G of Y is evaluated in parallel, or at least it can be. The compiler
decides whether to, but it doesn’t do any work -- it sees F of X plus G of Y, and it will say,
well, that can be evaluated in parallel.
In fact, you can’t write something that cannot be evaluated in parallel that would be
illegal. You can’t say F of X plus G of X, if F of X and G of X can both update, or if
either one of them can update X, that would be illegal in ParaSail.
So everything you’re allowed to write can be evaluated in parallel. The semicolon
kind of is a separation between statements. If you really want to say I want to do all
these things in parallel, you can put double vertical bars, and then you’re limited to
the same rules that you can’t write that unless they can be safely evaluated in
parallel.
If you can use a semicolon, then that says, well, if you can do it in parallel please do,
if you can’t, then don’t complain. And then there’s another operator, which says
then, process x, then process y, then process z, and that says don’t do them in
parallel. You may think it’s possible but don’t do it. Of course the compiler can
always say, well, I’m so smart I can do it anyway, but that’s not recommended.
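The three statement forms just described, in sketch:

    Process_X(X) || Process_Y(Y);    // explicitly parallel; must be provably safe
    Process_X(X); Process_Y(Y);      // ";" permits parallelism when it is safe
    Process_X(X) then Process_Y(Y);  // "then" forces sequential execution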
>>: So it’s up to the programmer to still sort of figure out, I mean, there’s this
whole work-span and sort of communication cost business for [inaudible]. Are you
leaving that up to the [inaudible]? If you [inaudible] parallelism then you’re going to
be slower than sequential, I mean, that’s just --
Because there’s a lot of experienced people taking pure functional languages where
you have amazing amounts of parallelism, but the overheads really kill you and it’s
still very hard to sort of do the right [inaudible].
>> Tucker Taft: Yeah. I sort of thought my time was more, sorry. Yes, there
is clearly tuning involved in coming up with the right granularity there, and
right now its heuristic is very simple. It must involve at least one out-of-line call if
it’s going to -- you know, if it’s F of X plus G of Y and those are both out-of-line calls,
then it will create pico-threads.
But these are not actually run on different CPUs; they’re simply candidates for being
run. And it uses a work-stealing model, if you’re familiar with that model, and so
whether that’s too small a granularity, that’s a good question.
The intent is to make it such low overhead -- these threads don’t have any context;
they’re really minimal overhead. But you know, you’re not going to do X plus Y in
parallel; that’s the idea.
So I should probably wrap up here so I don’t blow out completely. Somehow I was
thinking I was going for five o’clock, but 4:30 is where I was headed. So there’s a lot
more to say, but you know there’s always another time to say it. But I’m happy to
answer more questions if you have some at this point.
Okay, thank you.
[applause]