>> John Feo: Okay. So today we have Tim Mattson from Intel to talk to us about
parallel programming, particularly parallel patterns. Tim describes himself as an
old applications programmer. I've known Tim for, I don't know, at least ten, 15
years, and that must make me older, but at any rate he's worked on a wide variety
of machines going all the way back to the Cosmic Cube and the Cray 2 and the
vector machines and the parallel platforms, and a variety of languages, everything
from Linda and Strand and HPF and MPI. So lots of experience. And I think
that's the best way to really, you know, understand why parallel programming is
difficult and the different alternatives.
Tim has been working at Intel since 1993 and he's their evangelist for parallel
computing. And his current interest is in the design of pattern languages for parallel
programming, which he'll talk about today.
>> Tim Mattson: Okay. Thank you. So well, thank you -- first off thank you for
inviting me to come here and talk to you folks. I'm just based down the road in
DuPont, so it's fairly trivial for me to just bounce on down here and talk to people.
So it's funny that I almost never get here. But hopefully that can change over
time.
I suspect this might be a little bit different than many talks you get, in that this is
an ideas talk. I'm throwing out some ideas. I have some firm beliefs on what we
need to do to solve the parallel programming problem, and some ideas on how
we can get there. But we're not there yet, and so I'm not going to have a specific
programming language and specific technologies. But hopefully I can lay out a
picture of some of the things we need to do and should be doing, and my hope
is that at least one or two of you out there at the end of this talk will go, yeah,
yeah, let's work together to make this happen faster. And so I'll have more about
that in a second.
But I am from Intel and therefore I always have to start with a disclaimer. These
are my own views, they are not Intel's views. I did not have Intel's lawyers go
over this talk in advance, so if I offend you or make any of you upset, blame me,
don't turn me in to Intel, please. Also, I work in a research group. There's
absolutely nothing I will say that has anything whatsoever to do with an Intel
product. I just love that. And I have tried my best to keep this talk completely
free of Intel IP. So I think I'm safe.
So let's go ahead and get started here. The many-core vision -- you know, a year
ago I had to have three or four slides introducing this. I don't anymore. We all
buy in. Moore's law is going strong, and you know, it amazes me that we're
already demonstrating samples at 32 nanometers, so we are really sticking to this
schedule.
And to really understand what this means: by 2017 we should be making it to
the eight nanometer process technology, which is mind boggling even to
contemplate, and that means 32 billion transistors will be the integration capacity.
And what this all comes down to, from Intel's point of view, is what the heck are we
going to do with all those transistors. Because the day people stop valuing our
transistor densities, our business model falls apart, the business model of the
industry falls apart, all hell breaks loose and life becomes miserable. So we've
got to figure out how to keep those transistors at high value. And you all
know that means many-core.
And I want to emphasize, I'm trying to get people to stop thinking
about multi-core, which is where you take SMP technology and put it on a single
chip. Think many-core, where you have general purpose cores, special
purpose cores and an interconnection fabric that ties them all together. It is
crystal clear that this is where we're going. And therefore from a software point
of view, from a hardware point of view, this is the direction we need to start
thinking.
So how many cores in many-core? Well, you know, dual core in '06, quad core in '07. I
had the unique privilege of being responsible for the software team on this 80
core chip, so you know, we're already in the dozens and hundreds -- you know,
it's possible to build dozens of cores. How many cores we ship of course
depends on what the market's ready for, and of course the market is not ready
for a beast like this, but we already have a good idea of how to build them and are
continuing research to figure out how to make them really practical.
But this is not new territory for Intel. We've been doing hundreds to even
thousands of cores, you know, spread out over many machines. We've been
doing it for a long time. And I want to emphasize this because I think some of
you -- especially you younger folks; I see some gray haired folks out there who, you
know, have been at this for a while and have a real understanding that this is
really, really old. But I think some folks, especially those new to the field, don't
appreciate just how old it is.
Intel shipped a commercial hypercube in 1985. I think nCUBE just beat us to
claiming to be the first commercial hypercube. We were right after them. So
1985 is a long time ago. And you know, we built some of the first gigaflop
machines. Eventually -- and for some reason my picture doesn't show, oh,
man, there it is, why didn't it show the first time -- you know, eventually by '96, '97,
we built the world's first teraflop machine with over 9,000 processors. So, you
know, we know the hundreds to thousands of cores range. We know what it
takes from a hardware point of view, from an operating system point of
view, from a parallel programming point of view. This is old territory to us.
And of course what it got us was membership in the dead architecture society.
And I really think it's important for people to keep coming back and looking at
pictures like this, because it's a been there, done that story. You know, in the
late '80s and '90s everybody had to go parallel, it was the big bandwagon,
we all jumped in, and I recognize some of you probably were attached to some of
those companies that failed, because as you notice, there aren't very many
arrows continuing to modern times, and the companies with arrows continuing to
modern times, like this one and SGI, don't look anything at all like they did back in
the heyday. I think we can only conclude that parallel computing is toxic and we
are utterly nuts to be going back into this space right now.
And I think if we don't spend some time asking ourselves why it failed in the
past -- if we don't understand that -- we have no chance of getting it right this time
around. And I really would like to get it right. So I think clearly what went wrong
-- there we go -- was the software. You know, we figured out how to build
wonderful hardware back in the heyday of massively parallel computers, but the
software never showed up.
Now, in the national labs where you had folks that could spend and do whatever
it took to get the software, they had all the software they needed, but we firmly
believed back in the late '80s and '90s that every engineering company from the
oil industry to the automotive industry to the pharmaceutical industry, that they
would all need and have a tremendous appetite for these systems. But the
software never showed up.
Now, so obviously what we have to do is think about this as a software problem.
It's not a hardware problem. We can build all sorts of weird hardware, but if the
software can't take advantage of it, then it's useless. Can we generate the
software automatically -- parallel software automatically? It amazes me that even
today I hear people talking about how we can have implicit parallelism, where you
describe the algorithm declaratively at a high level and then some magical tool will
turn it into a parallel program. Look. We know that doesn't work. We spent
decades of research trying to make that work. Why we would possibly think it would
work this time around, I don't understand.
It's a valuable research approach -- I'll grant that -- it's a valuable research
approach. I like people to keep doing research and thinking about it, but I'm not
going to bet the future of my company on someone figuring out how to make this
magic work, where I can express my program and it can automatically discover the
concurrency. So I just don't think that's a line that I see being productive.
>>: [inaudible] I wonder, though, is it really [inaudible] being able to write the
software? What about the data? I mean, you don't have the data to feed this
monster, then all this [inaudible] were they really at the point where they knew,
you know, how to represent their data, put it in a form that they could really
compute over it? I mean, it seems like the big difference between now and then
is we have a heck of a lot more data.
>> Tim Mattson: Well, I --
>>: We also have more standard ways for talking about it.
>> Tim Mattson: So your question is, was the issue really a lack of parallel
software, or was it the inability to handle the problem at the data level and to have
the data that could take advantage of it? And I guess I have to -- I mean, I think I
understand where you're going with that. But in several industry sectors we
tried really hard to get penetration for parallel computing and failed, and they
had plenty of data. So I'd look in the -- in several of the engineering disciplines,
and in the pharmaceutical sector where I did a lot of work, the data wasn't the
holdup. They had the data, they understood how to format the data, they had the
big problems with the big data sets. Moving it around was hard. I mean, the old
saying was that an old MPP was like feeding a super tanker with a drinking straw.
So there were those bandwidth issues. But in terms of did they have the data,
did they have the problems, did they understand the data and how to address
them on these large parallel machines -- that I don't think was the issue. Yeah?
>>: I mean, I really want to come back to: was it really software, or was it just not
cost effective at that time? Because industry after industry, you could say, oh
yeah, it seemed like, you know, they should be doing this, but maybe the failure
was you guys just didn't make it cost effective.
>> Tim Mattson: No, I -- go ahead, John.
>> John Feo: I think it's a combination. I think the problem is, like you were
saying, getting the data to the processors I think was the difficult thing, okay,
and I think that's where, you know, the machines didn't perform well. And you
could say, well, we didn't have the right tools to sort of collate the data and
the operations, or maybe we didn't have the right hardware, which made it difficult
to move the data to the processors. So I think it's that -- the data movement, the
data layout problem, which was, A, difficult to do, and, B, caused the performance
and scalability to be so poor.
>> Tim Mattson: Yeah. Respond to that second --
>>: How much of that was they couldn't do it, and how much was simply that you
could do it cheaper on commodity hardware [inaudible].
>> Tim Mattson: No, they're not commodities. Okay. Wait a minute. Wait, wait,
wait.
>>: No, but --
>> Tim Mattson: A commodity is defined as a product where there's no distinction
between the sources you buy it from other than price, and a microprocessor is not a
commodity. People commonly call it that, but there are real differences
between an AMD and an Intel. At any rate, I'll get to that next. [laughter] What
bugs me is when people at Intel call them commodities. And it's just like, no, I
know the market calls them that, but it's very important to us businesswise that
people don't think of them as a commodity.
Okay. So several great ideas here. You know, obviously I'm oversimplifying. So
yes, there were data issues that made these machines impractical.
Yes, part of the problem was Moore's law, just, you know, the
microprocessors coming out. Once we hit our stride with the Pentium Pro in the
mid '90s and then its follow-ons, we finally had a processor that really could
do floating point well and fast, and that created a pressure. So that was a factor.
But let me respond to the software problem. My role at Intel has
always been the applications person who can speak the language of the
scientists and who can speak the language of the computer and software
architects. And in the pharmaceutical sector, where I worked very, very closely,
they had a whole slew of programs they used all the time. And for the end user
medicinal chemist, you know, they had their productivity software that they were
using in their domain -- their molecular dynamics, their quantum chemistry, their
process control flow applications -- and pieces of them would run on these
machines, but you wouldn't get the whole thing running. So it killed the
conversation about adopting parallelism before it even really got to issues of data
movement and whether it was competitive with other opportunities.
You know, the fact was the software wasn't running on the parallel machines, so
it made it really hard to go anywhere with these conversations. So all of these
factors were important. But we couldn't even get to the point of let's work
through those factors because the software wasn't there to start the
conversation. So at any rate, the software was a huge -- a huge, huge
part of it.
All right. So it's a little bit controversial in some circles; especially the younger
folk really want to push this idea that a magical tool will solve the
problem. But we know better now. I mean, we've got years and years of
research. Our only hope is to get programmers to write parallel software by
hand. People have to create parallel software, and at some level they have to do
it explicitly. They have to describe the concurrency and they have to describe
something about the way that concurrency is going to be managed.
So the solution of course was, let's create a new parallel programming language.
And this is a slide I just love. I went through the literature and tried to gather
together a lot of the important languages you could find in the mid '90s. And to
give you an idea of what a parallel programming nerd I am, these are the ones that
I've at some point written software with. So -- I mean, it's just ridiculous how
many.
Now, what I want to assert and what I want to warn you all about is this was
really, really bad. This was a bad thing. And I want to convince you of that. And
ask the question, did this glut of parallel programming languages help us or hurt
us? And I want to describe what I consider to be one of the most important
computer science papers of the last few decades.
This is the Draeger grocery store study. Do people know -- have people seen this
study before? Okay. Good. This is really important. And since Microsoft is
such a technology engine, I want you to really internalize what this study told us.
So this was done at Stanford -- it was a marketing research
group at Stanford; I forget what department they're in. But what they did is,
Draeger's -- the Draeger grocery store, I guess it's in Palo Alto or in that
neighborhood -- is one of these gourmet specialty grocery stores that has just the
really high end specialty items. And they set up two displays of gourmet jams.
One display had 24 jars, the other display had six jars. And the idea was that a
customer could come up, sample the jam and get a coupon for a dollar off.
So if you think about this, what they could do is they could count how many
people walked into the store, how many people sampled the jam and how many
people eventually purchased. So they could easily gather that data. So then
what they would do is they would randomly swap the displays and they would
train graduate slaves to stand there and to properly you know present the
product so they could see how these two different displays worked.
Now, what they found is that the 24 jar display indeed attracted more people. 60
percent of the people who walked in the store walked up and would sample at
the 24 jar display, whereas they only got 40 percent of the people with the six jar
display. What is really critical and what completely shocked them when they did
this study is that 30 percent of the people who sampled actually purchased with
the six jar display and only three percent purchased with the 24 jar display.
Now, this has been confirmed time and again. They went on and did
other studies. So, you know, one could argue that, well, this is just jam, it's
frivolous. But it's been confirmed with 401(k) plans at businesses. They would
look at companies with a huge complex array of 401(k) plans that you could
voluntarily contribute to, versus companies that had three or four, and they found
the companies with three or four actually had higher participation rates. So this has
been confirmed with things that matter a lot more than jam.
And they coined a term for what's going on here: choice overload. There's a
natural human tendency, when you present a human being with too
many choices, to just walk away. It's like, it's too hard to make a choice now,
I'm going to go away. So now, keeping this in mind, what do you think the
response was to this?
>>: [inaudible].
>> Tim Mattson: I really think that back in the '90s we hurt ourselves pretty
seriously, and I don't want us to do it now. I think it's a good thing that today --
now, there's an HPC bias here, I understand -- but today really there aren't that
many parallel programming models that are heavily used in HPC. You know,
there's of course the people who do hand threading, with the Win32 API or POSIX
threads in the UNIX or Linux space. If you want to do compiler directives, there's
OpenMP. If you want to do message passing there's MPI, and frankly MPI
trumps all the others in the HPC space.
And then there's these new kids on the block. CUDA has really taken off, but
CUDA will rapidly transition to OpenCL, because OpenCL is a standard; it does
functionally pretty much the same stuff as CUDA, but it's an industry standard,
so you're not tied to one vendor. And interestingly, [inaudible] seems to really be
gathering momentum, and it's kind of the exception that proves the rule, because it
emerged from an end user community. It wasn't computer scientists or people at
a chip company sitting down saying this is how I think you should program; it was
a telecom community that had a problem, and they created a language to solve their
problem. And then they rolled it outside. So it's very interesting how that one
evolved.
But the fact is there's not very many today. And since choice overload is a very
real phenomenon, what I caution us in the industry on -- Intel, Microsoft, all of us
out here working on this problem -- is that new languages for research are great.
You know, please spin off as many languages as you can, and keep them inside your
walls and study them and figure out what works and what doesn't work. But
when you're ready to deploy to the outside world, less is more. We can actually
damage ourselves by spinning too many out.
This worries me because at Intel right now I'm aware of seven different parallel
language projects, and I shudder to think what happens if we try and turn them all
out. I bet you guys have even a lot more than that. Yeah?
>>: [inaudible] DEC used to have a department of product prevention.
>> Tim Mattson: Oh, really, DEC had a department of product prevention? I love
that. That's good. Okay. So what I would rather see people do -- the tendency in
parallel computing historically has been, if you see a language you don't like,
make a new one. And I ran into this back when I was at Yale. I couldn't attract
graduate students very well when I was on the faculty at Yale, because I wanted
to do research in usability, I wanted to do research in how effective different
languages were. And they realized they'd have a tough time when it was time to
get the PhD if all they knew was how to use languages. But if they
created their own language, yeah, yeah, that was considered PhD worthy work.
So I really think we need to take on a different mind set, and we need to think of
how we can fix what we have today before we turn to something all new. A
classic example of this is that OpenMP 2.5 could not handle a simple pointer-following
loop like this. It was very awkward what you'd have to do to deal with
that construct. So rather than throw OpenMP away, we amended it in
OpenMP 3.0. So, like all things in OpenMP, the parallel construct says give me lots of
threads; the single says only one thread will execute the following block between
the curly braces. So this thread goes through and it creates tasks, separate tasks,
to process this loop. So now we've extended OpenMP and made it so that it can
handle these kinds of structures, which are very common in modern software
engineering.
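A minimal sketch of the construct he's describing, in C with OpenMP 3.0 tasks; the node type and the process() function are hypothetical stand-ins for the code on the slide, not taken from the talk:

    /* One thread walks the list; the rest of the team executes tasks. */
    #include <omp.h>
    #include <stddef.h>

    typedef struct node {
        struct node *next;
        /* ... payload ... */
    } node_t;

    void process(node_t *p);            /* assumed work function */

    void traverse(node_t *head)
    {
        #pragma omp parallel            /* "give me lots of threads" */
        {
            #pragma omp single          /* only one thread runs this block */
            {
                for (node_t *p = head; p != NULL; p = p->next) {
                    #pragma omp task firstprivate(p)   /* one task per node */
                    process(p);
                }
            }
            /* implicit barrier: all tasks complete before the region ends */
        }
    }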
We fixed the API rather than throwing it away and creating something new. And I
would really -- you know, this is less sexy than coming up with the new
whiz-bang TimMP, but I think we really need to think that way: fix what's out
there instead of creating things that are new.
I might add, when I look at language design, I look at C, C++ and Java as all one
family, because to me the programmer syntax matters. What you do under the
covers, whether it's a virtual machine or a just-in-time compiler, I could care
less. All I care about is the code I write. And I love the fact that C# looked at that
same tradition and said, how do we make it better? I mean, that's the kind of
thinking that I would like to see extended to parallel programming. Take what's
out there, accept it, and fix it.
But patching up old APIs is not enough. I think that's a critical step, but there's more
we have to do to solve this problem. One issue is what I call the parallel
composability problem. Modern software is not written the way old scientific
software was, meaning, you know, in the old days what we used to do is you'd
have a team at some lab, a program would come in, and the team
would sit down, roll up our sleeves, and we would parallelize it and tune it.
Now, of course, software is written as components; they come from many sources,
they come from many languages. So the parallel programming idea that assumes
everything is under one umbrella in one program just falls apart, and you have
to be able to compose individual modules together.
I mean, a good example of this is the MKL library. This is the math library that
Intel has. It uses OpenMP. And if you had an OpenMP program built with the
Intel OpenMP compiler calling MKL, which uses OpenMP, you could not fit them
together consistently. What I mean is you would have oversubscription problems;
you would have all sorts of problems, because there's no concept in the design
of OpenMP or the supporting runtimes for how to compose modules. So this is
a huge problem.
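A minimal sketch of that failure mode, assuming a hypothetical routine lib_compute() that, MKL-style, opens its own OpenMP parallel region internally:

    #include <omp.h>

    /* Hypothetical library routine; internally it contains something
       like: #pragma omp parallel for over its own work. */
    void lib_compute(void);

    int main(void)
    {
        #pragma omp parallel num_threads(8)   /* application asks for 8 threads */
        {
            /* Each of the 8 threads calls into the library. If the library
               is served by its own independent OpenMP runtime (or nesting
               is enabled), it creates its own team too: up to 64 threads
               contend for 8 cores, because neither level knows the other
               exists. */
            lib_compute();
        }
        return 0;
    }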
Now, I think the only sorts of people who can solve it are companies
like Microsoft that control their infrastructure. And I know you have a common
concurrency runtime project, and I saw a presentation on it and I felt like jumping
up and applauding, because that is the only way we're going to solve this: if
things map onto a common runtime, then you can do resource
management and handle the oversubscription problem, and you can handle data
locality issues that cut across modules.
So I'm assuming you guys are going to solve the composability problem, at least
in your universe, and from Intel's point of view that's good enough, that's pretty
cool. So that's great.
So I don't need to deal with that one. But the other area is: how do we
intelligently evolve parallel programming languages and APIs so that they work
for the mainstream general purpose programmer? That is a second problem.
And that's one where I think standards and working across the community are key.
And this is something that Intel's really good at, because you know, we're software
neutral and we can bring people together from very different persuasions. And
because of our nature -- sitting at the heart of the computer, though someone
else's GPU often sits there too, and someone else, the vendor, is actually selling
the machine -- we're actually in a very good position to work on this
problem.
How do we figure out how to intelligently evolve the parallel programming
infrastructure? So that's what I'd like to talk about. Now, when I look at the
history of parallel programming and that huge slew of languages, you would
think we would have solved this problem by now. But we haven't. And I am
convinced -- and it takes me a long time to develop why I'm so convinced of this,
and I won't do that now, but I could walk you through it at some point -- I'm
convinced a huge part of it is that we did not approach the problem scientifically.
We did not approach parallel programming scientifically. We approached it
as engineers. If I had a new idea for a parallel programming language, what did I
do? I found some graduate students and we created it, and we wrote some
codes with it, then we published a paper, and we jumped up and down and
patted ourselves on the back, and then we'd go off and we'd do the next one.
And that's very effective if you're building bridges and buildings; it's not effective if
you want to develop a theory for how to improve and evolve a technology. For
that you need some kind of systematic basis for comparing them. Basically,
survival of the fittest has been proven to be an incredibly effective mechanism,
but you have to have some way of deciding what's fittest. And that means
you have to have a way to measure and assess and talk about parallel
programming languages and compare them. I think that's one of the
fundamental things we need to do. And that's one area I would love to work with
you folks on.
So how do we compare parallel programming languages today? Now I'm
going to pick on a friend of mine at Intel, because he has a slide like this he used
to carry around a great deal. And I'll leave him anonymous, though some of
you may know who he is just from this slide. But he'll talk about his super cool
whiz-bang nested data parallel language, and look how superior this is to
OpenMP, and man, doesn't that say you absolutely want to use that nested
data parallelism? But what does this really tell us? It doesn't tell us anything. He
doesn't explain why the OpenMP code was written that way. He doesn't give us
any performance data. He doesn't give us anything. All this tells us is that he
prefers nested data parallelism to OpenMP.
So okay, we get that. I don't know how valuable that is, though. So frankly, I
think my colleague should be absolutely ashamed for ever using a slide like this.
Shame on him. And shame on me. This is a slide I often use in my OpenMP talks.
Look how nice and elegant this OpenMP is, but gosh, if I did it with the Win32
threads API, oh, my God, you don't want to write this crap. No way. This is
much better.
So the point I'm trying to make is, look, we're all guilty of this. We're all guilty of
engineering by pot shots. And that's got to stop. If we want to evolve the state of
the art, then we have to support the state of the art with systematic and careful
comparisons.
So I'd like us to do it right this time around. I really feel that we're on the cusp of
a second chance. We blew it in the '80s and '90s -- we meaning those of us doing
research in parallel software. We created some great stuff, like Sisal
and Linda and Strand, wonderful languages that nobody used. This time
around I'd like us to create stuff that actually works and solves problems and that
people really use. So I'm convinced that what we need to do is create a
systematic, scientific method for working on software. In other words, I can put
out a hypothesis, I can test it, I can peer review it, I can understand what worked
and didn't work, so then I can feed that back in and, over time, evolve
systematically to a parallel programming technology that will really work. That
procedure has never been done consistently in parallel programming. And that's
what we have to do.
And I break it down into these three steps. We first have to develop a way to talk
about how people solve parallel programming problems. We have to develop that
human angle: what are the ways people attack a parallel program, what are the
patterns they use in building a solution? We have to create a way to understand
that and talk about it. Then we need to come up with a jargon, a terminology, for
how we talk about programmability. So that when I'm sitting in a room of experts,
I don't say, gosh, F# is just a nightmare, I couldn't imagine using that -- oh,
yeah, well, OpenMP sucks. I mean, you've got to get to the point where you can
really talk about what's good about something, what's bad about something, how
the tradeoffs are made.
So we have to create that terminology for talking about programmability. And I'm
going to close just briefly with the plea that we have to come up with metrics --
programmability metrics. So let me go ahead and walk through this.
Now, we found in the early '80s when -- early '80s, early '90s. Gosh. You know
you're getting old when you're off by a decade. In the early '90s, some
of you older people here will remember the emergence of object oriented
programming. I mean, the technology was really, really old, but it took off in the
late '80s and early '90s. And in the early '90s I would characterize the object
oriented programming world as utter chaos. It was chaos because we had these
different languages out there -- which isn't bad -- Smalltalk, Objective-C, this weird
thing called C++, but there was really no idea of how to really use them. So you
had these horribly expensive projects. The one at Mentor Graphics is the one
I was most aware of, because they were right next door to where I was living
then. They spent millions and millions of dollars with teams of
engineers redoing all their software in C++, and then it didn't work -- they could
never get it working right, and once they did, it didn't run very fast. Because nobody
understood how to really do object oriented programming.
And what you had to do was write a whole bunch of programs and fail many
times and then eventually you'd get it and you could do object oriented
programming and have a lot of success. So a book came out in the mid '90s,
early to mid '90s on design patterns. And this is a famous book. I have a feeling
if I went around to your cubicles most of you would have this book on your
bookshelf. It's the gang of four book on design patterns. And it almost overnight
got the object oriented programming field to grow up. Because now all of a
sudden, if you were new to the field, you could read this book and you could know
the tricks of the trade that the experts took for granted. And that was really
valuable if you're trying to learn object oriented programming.
It was also really valuable because if you had a room of experts sitting around,
they could say, gee, what do you think we ought to use here, well, maybe the
factory method. Oh, yeah, okay. So you have a jargon that the experts could
use when talking about their field. So really, it was amazing to witness, as a
software engineer out there earning my living writing software, how almost
overnight object oriented programming went from chaotic jungle cowboy land to
a systematic engineering field.
Now, a design pattern is a solution to a problem -- a recurring problem -- that
captures the excellence in the solution to that problem, so you mine patterns
from excellent solutions. It's not necessarily anything brand new; it's just
codifying what the experts have worked out, putting a name on it,
writing it down in a standard way. Now, this book was a catalog of patterns. If you
go back to the origins of patterns, they talk a lot about a pattern language. And I
want to emphasize the importance of a pattern language. A pattern language
has the idea of a flow through the language. Patterns interlock; they fit together.
I have high level patterns that lay out the solution at a high level and lower level
patterns; they hierarchically nest, and you flow from the high level down to the low
level patterns as you go from the start of a problem to the end. So with a catalog, as
I said, you just look one up and say, that's my pattern.
With a pattern language, I use this pattern, which then flows me to this pattern, which
then flows me to this pattern. So a pattern language includes this higher level
knowledge on methodology that's missing in this book.
So what I did, starting in the late '90s -- and it was published in 2004 -- is I
attempted to do for parallel computing what the gang of four did for object
oriented programming, and I wrote this book with Beverly Sanders, who is vastly
smarter than me and comes from the formal verification community. So she -- I
want to do things fast, she wants to do them right. So we actually were a great
team.
And then Berna Massingill, who I met when she was a post-doc at Caltech, an
old applied mathematician -- well, she's not old, but an applied mathematician --
who does lots of work in parallel programming. So the three of us wrote this
book. And it is a full pattern language. And one of the fun things about writing it
is that the examples are in MPI, OpenMP and Java. So it was really fun to take a
pattern and see how you'd do it in these different languages.
However, the three of us basically come from a scientific programming
background, so if you read this book today, you would see that it does a really
good job of capturing how the whole HPC era of parallel computing did things.
So in some regards it's a little bit narrow on scientific computing, even though that
wasn't our goal. And we've learned a lot since we wrote this book.
So what I'm doing right now at the ParLab at UC Berkeley is to do the next
generation, to kind of throw this away and let's do it right this time but pay a lot
more attention to bringing in a broader community of programmers. So what
comes out at the end isn't for scientific computing, it's for all computing. It's for
general purpose computing.
Now, is everyone here familiar with the ParLab at UC Berkeley? Some yes,
some blank stares. Let me show you the one picture about the ParLab. So the
ParLab, this is the group at Berkeley that Intel and Microsoft are funding through
the UPCRC, which I can never keep all the letters straight. But we are together
funding this group to the tune of a big pile of millions of dollars. All right? So
you have a whole team of very smart people who are really excited about making
sure that Microsoft and Intel are happy, so five years from now maybe we'll give
them more money.
So I mention that, though, because I have found to a remarkable degree they're
very, very open to someone like me from industry coming in and saying gee,
have you thought about doing it this way, and they've changed, they've made
huge changes in how they do things. So I'm really pleased to the extent at which
they're looking for input. We're not just money bags, they really want us to be
direct collaborators. And I am very closely and directly collaborating with them.
So the way they're working at things is they have at the top a collection of
applications that are driving the research. The problem they're addressing is to
bring the world to a point where you write correct software that runs efficiently on
many-core, and they like to say that they're addressing the single socket
problem. So they're not trying to address the HPC problems and the
cluster computing problems. Those are all very interesting, but let's just focus on
the parallelism that sits in a single socket.
So how do we make that software routinely parallel that's correct and efficient?
So they have these applications at the top -- image retrieval, hearing and music,
speech, et cetera -- and what we're doing is we are analyzing those applications
and we're going to create an understanding of how humans think about them and
how humans parallelize them, in terms of a design pattern language.
>>: [inaudible] a bunch of cores on a single chip, is that what you mean by single
socket?
>> Tim Mattson: That's what I mean by single socket, yes, lots and lots of
cores on a single chip. Which -- I mean, I gave you the word they use, but it's not
quite right; it's almost better to call it a single board, because they're also really
into, you've got a GPU sitting there, let's use that, too.
But the thing is, a lot of these people have relationships up the hill at the labs
where everything is huge HPC, so they need a simple slogan so that when the HPC
people come to them they can calm them down and say no, no, no, we really mean
your laptop. Your laptop. Your handheld device, the single platform.
>>: [inaudible] you have shared memory or not, right? [inaudible].
>> Tim Mattson: Well, maybe. But I don't submit that you have to have shared
memory. It's not really -- I mean, yes, today in a single socket you tend to have
shared memory between all the cores, but nothing says it has to be that way.
My research at Intel is doing a distributed memory cluster on a chip, and I'm finding
that's an incredibly productive architecture both from a software and a hardware
point of view.
>>: So that would say that distributed memory is as hard as HPC?
>> Tim Mattson: Yes.
>>: [inaudible].
>> Tim Mattson: This problem is definitely as hard as HPC. In many ways it's
harder. And it's harder for sociological reasons. In HPC I have programmers
who actually enjoy -- enjoy -- the difficulty. I mean, you just have to visit one of the
national labs and sit around with the programmers and you'll hear them go, like,
yeah, man, it was great, I was up till two a.m. getting that routine debugged and I
got it from 97 percent to 97.5 percent speedup, it was just great. They like that
stuff. But here, you have real world programmers who have deadlines to hit
and they have to add features and ship on time, so the sociology is
totally different; these people will not take the pain the way the HPC people do.
>>: The memory may be shared, but is the cache shared? Because you don't
want the cache all over the place.
>> Tim Mattson: So that's a separate issue. And from a hardware point of view
that is the crux of the issues we're working on right now at Intel. So a lot of
the -- I mean, the Intel products today are shared cache.
>>: [inaudible].
>> Tim Mattson: Not the L1s. They have their own L1, but they have a last level
cache they share, okay. So you have an L1, sometimes even an L2, that's per tile.
That 80 core chip that I worked on was shared nothing. There was -- there was
no sharing at all. I have a research project going on right now that we haven't
released in public, so I can't say much about it, but it's kind of a hybrid between
the two: there's some sharing, there's some not sharing. So I mean, we're basically
exploring the design space very broadly, which is why I responded to your
comment about, well, it means shared memory. It doesn't have to mean
shared memory. And when you look at the overheads of managing a shared
cache, you quickly get to the point where you're eating up the bandwidth on that
on-chip network just keeping track of who has which piece of the cache. And at
some point you have to wonder, wouldn't we be better off to just get rid of the
shared cache?
Now, I know I'm getting off on a tangent, but I think this is an important tangent
for people thinking about how to do parallel programming. I have found
anecdotally, when I sit down with old timers like me who have done tons of
shared address space programming and message passing programming, we
prefer message passing to shared address space. Why? Because in
message passing, any time sharing occurs it's explicit in the text of the program.
I know sharing's occurred, and if I don't have that send and receive or
communication routine then I know I have isolation. Whereas in shared address
space programming it's the opposite extreme. There may be sharing going on
that I have no clue about, and there's nothing in the text of the program to tell me.
So I think there's this illusion out there among us computer scientists pushing
parallel programming that shared address spaces are a good thing, and I don't
believe it. I think in fact they may be a really bad thing and we may need to really
move away from them. So shared memory may be okay, but I like the idea of a
partitioned global address space. There's an address space -- you can get to the
memory, it's shared -- but it is apparent in the text of the program when sharing
occurs, so there's a discipline on how that sharing occurs.
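A minimal MPI sketch of the point he's making, assuming just two ranks; the only place data crosses between the processes is the visible send/receive pair:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, value = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            /* sharing happens here, and only here -- it's in the text */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }
        /* everywhere else, each rank's data is isolated by construction */

        MPI_Finalize();
        return 0;
    }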
So -- and keep in mind as I say that, I'm Mr. OpenMP, you know, I've spent the
last ten years working in OpenMP in a shared address space. So in some
regards it hurts me to say this and my OpenMP people throw things at me when I
say this, but I really do think some day we will look at this whole shared address
space trend that started back in the '80s and '90s and just think it was a big
mistake and throw it away. So I don't necessarily think it will be shared memory.
At any rate -- yeah?
>>: [inaudible] a shared name space has been the foundation of traditional
sequential composition on which we've built layered software. If we throw that
out, are we throwing out all our notions of modularity and layering that went with
it?
>> Tim Mattson: No, no, I don't think so at all.
>>: How about all the current software that's built on top of those notions and
layering?
>> Tim Mattson: Well, that becomes very difficult then, doesn't it? Yeah.
>>: [inaudible] he's the invited speaker. [laughter]. So we'll let him keep thinking
that we can.
>>: I know but in [inaudible] caching memory there's some ramifications.
>>: I'm keeping quiet.
>> Tim Mattson: There are huge ramifications. At any rate at Berkeley then you
have the applications at the top driving things, we have a design pattern
language that captures the idea of what those applications are doing, what the
concurrency is, what the parallel algorithms are, then you have a lower layer that
they call the productivity layer. And the idea is that you have a large universe of
programmers who are focused on just getting the job done, they're usually closer
to a domain of expertise, not hard core computer scientist, definitely not hard
core concurrency experts. So they're living up at this range here in the
productivity layer.
And the hope is that they can do a lot of what they need through high level
languages, maybe declarative languages, parallel libraries, high level
frameworks, you know, this is where you really need to raise the level of
abstraction and make it easy for the general purpose programmer.
Then there's a smaller number of programmers who work at this efficiency layer.
They're the ones who I would -- these are the traditional HPC people. You know,
the advantage we have in the HPC world is everybody was in the efficiency layer.
So we didn't even have to worry about the productivity layer because we didn't
care to be productive, we just wanted to be efficient. So you know, MPI, low level
languages, threads; they worry about how you get the performance up on the
library, so auto-tuners that search parameter spaces. These are the people at
the efficiency layer who worry about getting every last ounce of performance.
The productivity layer, they'll trade off performance for the ability to quickly
engineer software.
>>: I wouldn't say that the HPC people don't care about this, they care greatly
about the [inaudible] but they'll always throw it overboard for an extra [inaudible].
>> Tim Mattson: You're right. The HPC people -- I know, I'm being slightly
facetious. They do care about productivity, but they won't take -- I mean, they'll
say things like, I need a new programming language, it's got to be easier to write
software, but then when you give it to them they throw it away because it costs
them a few percent. So I often say that the HPC people are just a bunch of
whiners and that they should just shut up and sit down with MPI and leave us
alone, because when all's said and done you just give them MPI and they're off
and running and they get their jobs done. And they do beautiful work.
So I'm frustrated by them, but on the other hand that's kind of my people. I
understand them. Yes?
>>: The two layers seem to pretty much match the database people, who have the
database manager and the database user. How well does this compare?
>> Tim Mattson: I have no idea how this compares to the way the database
people break things down because I'm not a database person.
>>: You're not.
>> Tim Mattson: So I would imagine that if I looked at different slices of
computing, they'd have a similar breakdown. And in fact, I was at an OpenCL
meeting recently, and one of the interesting things for me about OpenCL is we
have participation from game vendors. So I actually sit down and talk to real
gaming software developers. I never used to talk to these people. And they were
talking about a technology programmer -- a very key thing in producing a game
is that time from a technology programmer is really, really expensive. Well,
when I probed them on what they meant, the technology programmers are what
they call the efficiency layer guys, and it's about one percent of their programming
staff, and they pay them a lot more -- so anyone looking for job security, you know,
learn how to be in this layer. They get a lot more money. And they do everything
in their power to minimize the amount of time they have to go to those technology
programmers. The bulk of their software developers are up in this productivity
layer. And that surprised me, because inside Intel we'd always been told that,
you know, game developers have got to hit those realtime constraints,
they're very performance centric. But actually, increasingly they're moving not just
to C++ but to scripting languages, Perl and Python, and it's really bizarre.
But at any rate, they too were talking about this split. The Berkeley crowd, we
think 90 percent are productivity, 10 percent are efficiency. In this gaming
community they were saying 99 percent were productivity and one percent were
efficiency. So it seems things are naturally migrating to this split. And I've talked
to people in the business world and they've similarly said there's this split,
somewhere between 90-10 and 95-5, between the productivity and efficiency
layers.
>>: Do you mean [inaudible] financial sector.
>> Tim Mattson: Not the financial sector because a lot of that looks like HPC.
I'm talking about business process, sales resource management, point of sale
management and you know, another area I know as little about as I can get away
with.
>>: [inaudible] and that type of stuff. You know, their model is right many times
on [inaudible] so they do really want productivity [inaudible]. Same thing with
[inaudible] always changing [inaudible].
>>: In the financial sector?
>>: Well, in the [inaudible] analysis.
>> Tim Mattson: So at any rate, finishing out this overview of the Berkeley folks:
you have groups working on applications, you have groups working on patterns, you
have groups working on the productivity layer, you have folks working on the
efficiency layer. And you have a group I just want to mention briefly, because I'm
really interested in what they're doing: rethinking the role of the operating
system and saying, well, what if I expose the scheduler as a first class item
that the efficiency layer people can manage? Because you know, most of the
time the scheduler's buried down in the OS and I have very little way to interact
with it. So it's very, very interesting how they're blurring this line here between
the operating system and the efficiency layer, and it's one of these areas where I'm
not participating in the research but I'm watching it very, very closely because it's
fascinating to me.
And some of the grad students working on this are graduating soon. So if you're
looking to recruit some really interesting people that's a good place to look. Then
they have an architecture group. So they're really top to bottom is what they're
doing here in the ParLab. And then we haven't heard much from them, but the
whole idea is off to the side, they're the people that are going to worry about
correctness, but I'm an old HPC person, I don't care about correctness. So
[laughter]. It doesn't have to be correct, it just has to be fast.
So what we're doing in the work I'm now describing -- this is work I'm doing with
Professor Kurt Keutzer, and I don't know if many of you know him, but
he's very well known in the CAD community and spent, I don't know, 15, 20
years in the CAD business before he became a professor. But what we're doing
is, we realized that the problem we're trying to solve is bigger than just parallel
algorithms, which is really what my book focused on -- just how do people do
parallel algorithms? What we really have to do, if we're going to address the
ParLab mission, is figure out the whole software architecture angle,
how that looks from a parallel point of view. And so our influences are, you
know, the traditional patterns community and my book influencing in one
direction -- so a very algorithm focused view of patterns; the Berkeley folks
broke down computing into these 13 dwarves, which political correctness doesn't
let them call dwarves anymore, so I think they're calling them motifs, but
basically the 13 patterns are common computational elements that seem to
appear again and again across domains. And then we looked at Garlan and
Shaw, who wrote that classic book that summarized software architecture
at a high level. And then finally, the influence that really came in strong was the
very original work by Christopher Alexander on pattern languages. And this is
where it all started, back in the '70s.
And Christopher Alexander -- I don't know if you folks know it -- it's actually a
beautiful book to read, and it's very interesting to read. Because what he was
trying to capture -- he talks about the quality without a name. When you see
something really, really special, something inside of you can tell that it's special.
And wouldn't it be nice if you could get experts together who understand that
specialness and if they could write it down and share it. So it's very much that
almost religious zeal for patterns and pattern languages. So you pull all of these
together, and you end up with Our Pattern Language. Okay, clearly that's a
working title. Some day we'll give it a better name. But OPL stands for Our
Pattern Language. And you know, I've given patterns talks
many times, and they're really hard to make interesting, because they get
down into a lot of detail and they quickly get kind of boring. So the thing to do is
just look at it at a high level, and then come to me and I'll give you more detail
later; I'll give you things to read.
But the idea is that we're trying to capture a language for the architecture of
parallel application software. We start at the top here -- in fact, let me advance to
the next slide -- we start at the top here with patterns that talk about the large scale
architecture of your application: pipe-and-filter, map-reduce, layered systems.
These are concepts that, if you know Garlan and Shaw and have been reading
software architecture academically, will be very familiar to you. Then over
here we have the computational patterns. These are the 13 dwarves. You know,
if you're familiar with the 13 dwarves, it started with Phil Colella, who
recognized and suggested that all of scientific computing was an instance of
seven archetypes that he called the seven dwarves. There's things like dense
linear algebra, sparse linear algebra, unstructured mesh, structured mesh, et cetera.
And what the Berkeley folks did is they sat down and took that as their
starting point and said, well, you know, if we add six more, we think we can cover
everything. So that's where the 13 dwarves came in. And then they
would invite people in from different application domains to sit down with them
and go through the dwarves to try and -- you know, see if they had
coverage. So they would bring game developers in; they brought the folks I work
with in the workloads group at Intel in to look at the RMS workloads; they brought
in financial people. So they really spent some time validating, to themselves at
least, that these 13 dwarves do a pretty good job of covering everything. And you
know, it's not cast in stone. In fact, we're very interested if someone looks at the
list and says, gee, you need to add two or three more. The list will grow. The
key thing is that it's a number on the order of 13; it's not a number on the order of
hundreds or 10,000. It's a manageable number. And these are the
computational patterns.
So the way I like to think of it is: when I'm at the architectural level, if I'm walking
up to a white board and I'm describing my application to you, I'm drawing boxes
and arcs between the boxes. Then I'm ready for the computational level; now I'm
saying what's happening in each box. And so it's iterative in nature: as I
start looking at what's happening in each box, I may say, gee, those two boxes
really should be merged. Or maybe I should split this box apart another way.
So we have these arrows here, these green arrows, pointing out that it's kind of a
back and forth between the architectural level and the computational level until
you've described the architecture of your application in terms of the main
boxes, the connections between them, and the types of computations occurring in
each box.
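To make the boxes-and-arcs level concrete, here is a minimal pipe-and-filter sketch in C; the filter names and bodies are hypothetical illustrations, not from the talk:

    #include <stdio.h>
    #include <stdlib.h>

    /* Boxes are filters with a uniform interface; arcs are the data
       flowing between them. The filter bodies here are trivial
       placeholders -- the architecture says nothing yet about what
       each box computes. */
    typedef struct { double *data; int n; } stream_t;

    static stream_t source(int n) {               /* box: source   */
        stream_t s = { malloc(n * sizeof(double)), n };
        for (int i = 0; i < n; i++) s.data[i] = i;
        return s;
    }
    static stream_t scale(stream_t in) {          /* box: filter 1 */
        for (int i = 0; i < in.n; i++) in.data[i] *= 2.0;
        return in;
    }
    static stream_t shift(stream_t in) {          /* box: filter 2 */
        for (int i = 0; i < in.n; i++) in.data[i] += 1.0;
        return in;
    }
    static void sink(stream_t in) {               /* box: sink     */
        for (int i = 0; i < in.n; i++) printf("%g\n", in.data[i]);
        free(in.data);
    }

    int main(void) {
        /* the composition below is the architectural description */
        sink(shift(scale(source(4))));
        return 0;
    }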
>>: Have you alluded to the notion of composition and how that is [inaudible].
>> Tim Mattson: Yes.
>>: Future parallel programming? Would you say that that's mostly a concern of
your left box, the architecture box?
>> Tim Mattson: Is composition mostly a concern of the architecture box? I don't
think so. But I'm not sure. I think it can't be just a concern of that box. And the
reason I think it's hierarchical is that composition's going to occur at every level.
I mean, what I'm talking about, for example: if I have a box and inside the box is a
spectral method -- that's one of these patterns here -- well, in the spectral method I'm
doing a transform, I'm doing some computation, then a back transform. So now I'm
composing several different routines inside the spectral method. Well, now,
when you start looking at doing higher dimensional transforms, there's
computation that would occur inside there. So it's all hierarchical, and
composition I think will influence every layer. It would be nice if we could just
say I only have to worry about it here. And I think often I might get away with only
worrying about it at the architectural level, but to ultimately solve the problem,
composition will touch every layer of the patterns.
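A minimal sketch of the composition inside a single spectral-method box; fft_forward, apply_operator and fft_inverse are hypothetical stand-ins, not a real library's API:

    /* transform -> compute in spectral space -> back transform */
    void fft_forward(const double *in, double *out, int n);
    void apply_operator(double *spec, int n);
    void fft_inverse(const double *in, double *out, int n);

    void spectral_step(double *u, double *work, int n)
    {
        fft_forward(u, work, n);    /* transform       */
        apply_operator(work, n);    /* computation     */
        fft_inverse(work, u, n);    /* back transform  */
        /* each call may itself contain parallelism, so composition
           recurs at every level of the hierarchy */
    }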
Did you have a question or comment?
>>: I was wondering [inaudible] between the architecture and the computation.
Do you think those are fundamentally different problems that need different
solutions? Because that's what it looks like on this picture. Everything else is
kind of layered on top of each other [inaudible] you have.
>> Tim Mattson: Side by side.
>>: [inaudible].
>> Tim Mattson: So we have those two side by side, and it's kind of artificial
because we're laying things out in a clean stack diagram, because we have to get
it onto a two dimensional representation. And depending on the problem, things
may stack differently. But I do think you're talking about very different things in
these two cases, and the solutions I think will look very different. Because here
I'm not saying anything at all about what is computed when I write the
architecture. All I'm concerned about is what are the major blocks and how are
they connected? Whereas over here, it's all about what is computed.
So I suspect that for the architectural patterns it will be pretty straightforward for
us to come up with some high level frameworks or a wizard to build them. It's easy
to think of some tools I could build that would support this architectural level.
>>: I [inaudible] I would say there's been more research done in the past on
the computation side, because that's where all the HPC work is. On the
architecture side I think it's this graph diagram [inaudible] techniques.
>> Tim Mattson: Okay. I'll agree with you that from a research point of view, in
the parallel world there's been very little on the architecture side. But when I
conceptually sit back and think about it -- I mean, the fact that I'm talking about it
as drawing boxes on a white board, well, it's pretty easy for a wizard to work with
that boxes representation. I mean, heck, look at Visual Basic. You know, I mean,
imagine a Visual Basic-like front end where I just sort of pull down little icons and
connect them together and bam, it creates my framework for the high level
architecture.
>>: [inaudible] let's figure out how data is being shared between those boxes.
>> Tim Mattson: It's very difficult to take an existing application and reverse
engineer it into this. That's a process we're going through now: validating this
design, building case studies by looking at existing applications.
But I'd respond to that in two directions. I gave my response on
looking at existing applications: we're finding that it is very difficult. It's
difficult because the patterns compose together in complex ways,
but it's also difficult because people writing software don't think architecturally.
We do a very bad job, especially in HPC. We've done a very bad job at thinking
about a software architecture.
In HPC -- and that's where I come from -- we would use a monolithic architecture,
which means no architecture: we just start writing code and merging loops. So
I would say that part of what I'm looking at with this is people writing new code,
and trying to influence people to write new code differently. For new code, I would
like them to think of an architecture as they move through here. If I
only solved the new-code problem, I'd be successful.
>>: But I think even for new code the question remains: what are the
right abstractions to build your architecture?
>> Tim Mattson: That is the question. And we think we've got -- we think we've
got them. Or we think we've got a great start at it. What we're doing right now is
testing that theory. Now, let me emphasize for those of you who haven't been in
the patterns community long, which is probably all of you or most of you: in the
patterns world we describe the process as pattern mining. A pattern is not
something where you sit down and go, I think this is how it should be. Patterns are
always mined from excellent solutions.
So if you look at the things here and you start reading them and you go, gee, data
parallelism, that's not new; pipeline, come on, people have been doing that for
ages -- yeah, okay, good. Good. If you look at this and you see stuff that's totally
new, then we've failed, because it should capture excellence in existing solutions.
And so that's what we consistently do. We go out and we look at excellent
solutions and we mine them and we understand the patterns and we either
validate that we have the right patterns in the right places or we validate we need
to add new patterns. And this is the process where we need to grow the
community and get more people in. Because with my book, there were the three
of us, and we came up with something very nice for the stuff the three of us had
seen many, many times, which is useful but it's not enough. What's enough is to
get a large community together and really boil down to a consensus on what
these basic building blocks are. Yes?
>>: [inaudible] solution [inaudible].
>> Tim Mattson: How do you know a solution is excellent? You know it when
you see it. Yes. [laughter]. That is scientific. Yeah. A community -- a
consensus in a community sees it as excellent. But, yeah, that is subjective. I
mean, we're capturing the human element here.
>>: For example, like when you were saying that you didn't care about
correctness in the HPC world, you care about performance -- an excellent solution
would be one that [inaudible] but perhaps [inaudible] this is bad.
>> Tim Mattson: That's right.
>>: Not good.
>> Tim Mattson: That's right. You are absolutely correct in that excellence is in
the eye of the beholder and therefore there is a subjective factor. This is very
subjective. When you start talking about design patterns you're stepping over
from the world of the typical computer scientist where everything is clean and
objective and very, very systematic to fuzzy, things are vague, maybe you can
take it this way, maybe you can take it that way. But you know, that's critical
because what am I trying to do here, I'm trying to capture that human angle that
we've traditionally done such a terrible job of capturing. So I'm trying to get it and
write it down. But I need to keep going because we're almost -- well, we're
almost out of time and you're almost sick of listening to me, I'm sure.
So this is just an example and -- oh, my. I learned long ago that if you spot a typo
on a slide you have to fix it on the spot or you'll never fix it. Okay. So this is a
one slide example, and I'm not going to walk through it just because of the lack
of time. But looking at the CAD universe, you have an overall pipe-and-filter
architecture -- that's what you see with the boxes here -- and inside the boxes
you have these computational patterns that appear. And this one picture
actually summarizes a huge amount of what happens in the CAD space. And it's
an example of how we've boiled the problem down into a form that we can
go through and start systematically analyzing and working through, instead of just
this very confusing mess that is the CAD application space.
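As a hedged illustration of the architectural idea -- not the actual CAD pipeline --
here is a minimal pipe-and-filter skeleton in C. The stage names are invented;
the point is that the architecture says only how the boxes connect, not what each
box computes:

    #include <stdio.h>

    /* Each filter transforms a work item and passes it on. The stage
       names and bodies are invented stand-ins for the real CAD stages. */
    typedef double (*filter_fn)(double);

    static double stage_parse(double x)    { return x + 1.0; }
    static double stage_optimize(double x) { return x * 2.0; }
    static double stage_layout(double x)   { return x - 3.0; }

    int main(void) {
        filter_fn pipeline[] = { stage_parse, stage_optimize, stage_layout };
        size_t nstages = sizeof pipeline / sizeof pipeline[0];
        double item = 1.0;
        for (size_t s = 0; s < nstages; s++)
            item = pipeline[s](item);   /* the pipe: one filter feeds the next */
        printf("result: %f\n", item);
        return 0;
    }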
So the patterns give us a way to precisely describe the computation, which is
valuable for the experts sitting around talking about how do I support this in the
software world and how do I build an infrastructure to support it, and it's valuable
to the person new to this field who's trying to learn it. But it also gives us a way
to compare and contrast. Remember that old programmability thing I started
with?
Now, remember earlier I just said, OpenMP is great, Windows threads
suck, or OpenMP is awful and data parallel programming is great. Well, now I
can actually go through and look at different patterns and say, look,
how well does this notation support programming in that pattern? Now, it's still a
little subjective, but now I've laid my assumptions out on the table for us all to sit
around and argue about. So four or five of us, each with experience in one of
these, could discuss and argue and trade off until we came to a
consensus on how these things play off. And so you can see if I'm doing
SPMD style programming, gee, everything works great. Okay. But if I'm going to
do, you know, an actors-type architecture, well, gee, MPI is great for that -- MPI
can do anything, right? MPI is great for that, OpenCL would be absolutely
abysmal, and OpenMP -- maybe I could do it, but it really doesn't support it well. So
I give it a red box. Even though I know OpenMP very well and I know how to trick
OpenMP into doing actors, I'm tricking it to do it, so it's not fair to say it can
do it.
So this gives me a way, when I have a collection of programming models to
compare, to lay them out against a set of patterns that I care about and now I've
productively said something about these languages rather than I prefer OpenMP
and everything else sucks. So I think it's a huge step in the right direction.
And let's see. Gosh, I thought I had -- because we're running low on time. I can't
believe it. I lost it. It was a great slide, too. This is what happens when you pull
slides together at the last minute. Is that slide hidden? Oh, it's down here. Okay.
I'm sorry, folks. So let me just go on.
All right. Now, I want to emphasize that to us at the ParLab the patterns are a
means to an end, they're not an end in and of themselves. So I think they're
valuable and they're valuable if we're talking about programmability. But the
other thing that we really like the patterns for is they give us a roadmap for the
frameworks that will support the productivity layer. So in our vision of where this
will go, you know, the three years from now where we're near the tail end of the
five year ParLab period, you know, we're hoping that we get to a place where you
have this small number of hard core technology programmers that produce these
parallel programming frameworks. And I'll use the word framework very loosely.
It can be a language supported by a set of libraries supported by a wizard. I
mean, it's software technology to support parallel programming. And then you'll
have a domain literate computer scientist who understands kind of the bridge
between the efficiency layer and what a domain needs, and they're going to
work with the parallel programming frameworks to create these application
frameworks. And then there are my end user programmers who know nothing
about parallel computing, and our goal -- I don't think it's possible, but the Berkeley
people do -- is that that domain expert doesn't even know they're writing parallel
code. That domain expert just has this parallel application framework that's been
given to them, and the parallelism is buried inside of there.
I don't believe that's possible, but I believe it's a good goal to shoot for, and I
believe if we only get 90 percent of the way there, so they may only say a little bit
about the parallelism and all the details can be buried, then we've been very
successful. But I'm hoping that the design pattern languages we're talking about
and we're working on will be the roadmap for building this hierarchy of
frameworks. And that's the end goal in the ParLab for the patterns.
And I'll skip that. And so the status of where we are with that language is we're
quite confident about the overall structure. It will need a little tweaking and
moving around. And we may need to change some definitions. But that picture I
showed you with the pattern language we feel pretty good that that's pretty close
to right. The top level patterns we have good drafts for most of those. The
bottom level patterns, some of them come straight out of my book and therefore
we have them, some of them still need to be written. But this term I'm teaching a
course at UC Berkeley where we're basically writing the rest of that pattern
language. We have the descriptions, we know what they are, but if any of you
have ever written a pattern, writing a pattern down is a very involved and difficult
process. It's very fulfilling actually. I mean I find it very satisfying because you
have to do a lot of research to make sure you really have found that nugget that
characterizes the good solution. And finding it and understanding it and writing it
down is like writing a good tight piece of poetry.
So what we'd like to do is really get more people working with us from a broader
community; we want to grow the pool of people that are working with us on this
language, because that's the only way we'll build a consensus for how this field
works. So we have at Berkeley these all-day pattern workshops. The next one's
coming up March 18th. Then we have one on April 22nd. So you're all welcome
to join. I would love to have the problem of turning people away because we
have too many. That would be a great problem.
And then we're having a conference, ParaPLoP -- the call for papers will be
going out soon. It will be in Santa Cruz, the 4th and 5th, which is after the ParLab
summer retreat. So we're actively growing the community to wrap this thing up.
So we're running out of time here, so let me just mention these other two briefly.
I'll skip a lot of slides, but I can make them available to anyone
who wants to see them. This other one is having a human centered model
of how programmers solve parallel programming problems. That's very
important -- that's what patterns are all about -- but it's not enough. We need to
have a language of programmability. Now, this came to me as I was studying the
psychology of programming. And if you ever want an interesting read, go out
and search out the literature on the psychology of programming. There's not a lot
of it, which I think is tragic and a huge mistake, because every known
programmer is a human being. So you would think that anybody who wants to
understand how to make programmers effective would want to understand how
human beings go about programming. But there is remarkably little research
happening on that front.
After years of working on the cognitive psychology of programming, one of the
founders, one of the real leaders in that community, Thomas Green, came to the
conclusion that coming up with a nice, neat, compact cognitive model
of human reasoning and programming wasn't getting us anywhere. What we
really need is just a way to talk about how notations convey information. And he
came up with this thing called the cognitive dimensions. They're a way of
laying out and talking about the tradeoffs you make in choosing an information
notation, and how those tradeoffs address the different problems in using that
notation and in describing something.
Now, one of the famous papers about using the cognitive dimensions comes
from a person I've never met, but I've read his paper many times: Steven Clark of
a company I think you've heard of called Microsoft. He wrote a paper where he
used cognitive dimensions to analyze C# back in the early days, when C# was I
think still being developed. So I mean, it's been used with C#, it's
been used for the design of remote controls for televisions -- information
notations can mean just about anything.
Now, I've got a lot of material here that I won't go through,
but let me just mention one of the cognitive dimensions -- probably my favorite
one, and I think it says a lot about information notations. This is the idea of the
viscosity of an information notation: how easy is it to introduce small changes?
As applied here, how easy is it to introduce small changes to
an existing parallel program? So look at OpenMP. I have
this loop that I'm parallelizing -- and don't worry if you don't know OpenMP -- I'm
just saying do this loop in parallel, do a reduction into sum, and you'd better
make the variable x private or you're going to have a race condition.
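A minimal sketch of the kind of loop being described; the loop body is an
illustrative stand-in, not the code from the slide:

    #include <stdio.h>

    #define N 1000000

    int main(void) {
        double sum = 0.0;
        double x;
        int i;
        /* Run the loop in parallel and reduce into sum. x must be made
           private to each thread or there is a race condition. The
           schedule clause is the low-viscosity part: switching from
           schedule(static) to schedule(dynamic) is a one-word edit. */
        #pragma omp parallel for private(x) reduction(+:sum) schedule(static)
        for (i = 0; i < N; i++) {
            x = (i + 0.5) / N;
            sum += 4.0 / (1.0 + x * x);   /* stand-in kernel */
        }
        printf("%f\n", sum / N);          /* approximates pi */
        return 0;
    }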
Okay. If I want to change the schedule -- because the schedule clause is mostly
semantically neutral, except for the possibility of introducing races -- I can do it
just by changing the schedule clause. So it's a very low viscosity notation.
Whereas if I did the same thing -- I'll get to you in a second -- whereas if I did the
same thing in the Win32 API, an explicit thread API -- and I'm not picking on
Windows -- now the schedule is something I'm hard wiring in there. If I want to
change from a static to a dynamic schedule, I have to make all sorts of changes.
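A hedged sketch of why the explicit-thread version is more viscous -- generic
Win32-style code around the same stand-in kernel as above, not the code from
the slide. The static schedule is baked into the worker function, so moving to a
dynamic schedule means restructuring the code rather than editing one clause:

    #include <windows.h>
    #include <stdio.h>

    #define N 1000000
    #define NUM_THREADS 4

    static double partial[NUM_THREADS];

    /* The static schedule is hard wired: each thread derives its own
       contiguous block of iterations from its thread id. A dynamic
       schedule would need a shared counter, synchronization, and a
       rewrite of this partitioning logic. */
    static DWORD WINAPI worker(LPVOID arg) {
        int id = (int)(INT_PTR)arg;
        int chunk = N / NUM_THREADS;
        int lo = id * chunk;
        int hi = (id == NUM_THREADS - 1) ? N : lo + chunk;
        double sum = 0.0;
        for (int i = lo; i < hi; i++) {
            double x = (i + 0.5) / N;
            sum += 4.0 / (1.0 + x * x);
        }
        partial[id] = sum;  /* one slot per thread: no race on partials */
        return 0;
    }

    int main(void) {
        HANDLE threads[NUM_THREADS];
        double sum = 0.0;
        for (int t = 0; t < NUM_THREADS; t++)
            threads[t] = CreateThread(NULL, 0, worker, (LPVOID)(INT_PTR)t, 0, NULL);
        WaitForMultipleObjects(NUM_THREADS, threads, TRUE, INFINITE);
        for (int t = 0; t < NUM_THREADS; t++) {
            CloseHandle(threads[t]);
            sum += partial[t];
        }
        printf("%f\n", sum / N);
        return 0;
    }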
So this is a more viscous notation. Now, notice very closely: I didn't say one
was bad and one was good. You know, there's things I can do with this high
viscosity notation that I can't do with that low viscosity notation. But it now gives
me a way to talk about them, and talk about the tradeoffs between these two
styles, and to have a productive dialogue about that. So now --
>>: What is it -- you call it semantically neutral?
>> Tim Mattson: Yeah, I know. They're relatively semantically neutral, yes.
>>: Does it matter what data [inaudible].
>> Tim Mattson: Yeah, then you put a reduction in there and you have the
different orders in how you do the sums.
>>: I think the important piece is to understand -- to make a distinction between
thinking about the abstractions being used and baking them into the language
using some [inaudible]. Say if I look at the success of [inaudible] patterns, the
good thing about them is none of them require you to change your language;
you can just use [inaudible] languages that are already there and you can
implement them all.
>> Tim Mattson: Right.
>>: No new syntax, nothing needed like that. So I think what happens in the
parallel world is people haven't quite advanced to that level; they're still revising
the language themselves. I mean, we should be talking about
abstractions that can be done on top of some base foundation that
doesn't need to change every time we think of a new --
>> Tim Mattson: So I'm going to agree with you, though maybe not fully. But
I think my pattern language is an attempt to write down that set of abstractions.
And as we iterate through it and once we come up with the right pattern language
where we get enough input from a broad enough array of people, that will give us
the set of abstractions and then we can start to talk about what the language is.
These cognitive dimensions are all about syntax. And so it's almost like I have a
complete, radical change of subject, because now I'm not talking abstractions
anymore, not much -- I'm really talking about syntax. But I can't tell you enough:
to the boring application programmer like me, syntax matters. I've had computer
scientists tell me time and time again, syntax doesn't matter, just get the
semantics right. Screw it. No. Syntax matters. See this Apple on this
computer? It's there because you changed the APIs in Visual Studio and broke
all my programs. I've never gone back to Visual Studio since. Syntax matters.
I'm still pissed about that, you know. So these cognitive dimensions are all about
describing syntax. And because I know you're way past your threshold, I want to
get to this nice picture.
So there's a whole range of these cognitive dimensions. And you know, you can
look them up: read the Steven Clark paper, if you Google cognitive -- I'm sorry,
you don't use Google here at Microsoft, you use -- what's the Microsoft one? Live
Search. Okay. Live Search on cognitive dimensions and you'll find there's a
whole list of them. And once again, it lets you -- I'm going to show you this one
here -- it lets you productively compare across programming languages. And I
can say that, yes, OpenMP has a very low viscosity, MPI has a very high
viscosity. I'm not saying good or bad, I'm just comparing them. Okay? And I
can go through: OpenMP, because of that stupid shared address space, is very
error prone. If you've written significant OpenMP codes, you've written some ugly
race conditions, and hopefully you have found them. Whereas MPI, actually, is
not that error prone. It's why we like it: it's pretty easy to
get my program correct. I may have to do an awful lot of work to decompose the
data structures, but you know, the point is this gets us past one is good, the other
bad, and it gets us to a point where we're talking about the features of a
language that make it good and the features that make it bad. So the last
thing I want to say, and then I'm going to shut up, is: we have metrics of
performance. Why don't we have metrics of programmability? We need
standard programmability benchmarks that we can use and argue about.
And my poster child for this is the [inaudible] problems. I think some of you here
have heard of them.
>>: [inaudible].
>> Tim Mattson: Yeah, I love this. But it's old and dusty, and I think we
need to update it, expand it, and get a broader community involved. And what I'm
thinking of is what I'd like to happen is the Berkeley folks did a really good job of
coming up with these 13 dwarves. So what I would like to do is get a group of
people together to define the 13 exemplars. So we come up with a
programmability benchmark for each one of these.
But in my mind, a programmability benchmark, to be effective, has to be defined
paper-and-pencil, like the [inaudible] problems, but there also has to be a reference
implementation that is short and easy to read, because this is the key: if I have to
read a 10,000 line program to compare one programming notation to another, I
can't even start. But if I can compare a 200 to 500 line program -- and yes, that
means we're kind of moving into the danger of the toy-code area -- I think we have
to be able to have these standard comparisons that we can make between
programming notations, or we'll get nowhere.
So that's all I had. Thank you. Just closing: what I really hope happens is that we
can start working together. And you know, think about it -- we have this
structure at Berkeley; it's a great place to come to work together where the
lawyers won't bug us. I want us to work together on a pattern
language, because we need as broad a community as possible to make it work. I
think we'd all benefit if we built a discussion framework for programmability; the
cognitive dimensions are a great starting point. And then finally, I really would
love to see us come together and work on these standard programmability
benchmarks.
And I think if we don't do this, chances are we'll just add to that list of
programming languages and we won't necessarily create anything that anyone
will use. So thank you very much.
[applause].
>>: Any other questions for Tim? We're going to lunch after this, just in the
cafeteria. And I think -- what's the afternoon special?
>>: We have a [inaudible] room 2209. And we [inaudible] about patterns.
>>: And that's from like one to two or something.
>>: [inaudible].
>>: Well, I'll ask one question, because I was going to ask you a question on the
metrics [inaudible], and the slide you did show was a little different than what I
was thinking as far as metrics. So as you distill down, right, and you take a
higher level concept into a lower level layer, how do you judge whether or
not your choice at that lower level is a good one to [inaudible]? I mean, how do you
calculate sort of the error, the inefficiency, you know, how much you've lost or
how much more you can write as you break down? Because as you go to
the lower level, I mean, [inaudible] choices, right, of how you implement more of
the higher level things. And so how do you decide whether you made the right
choice, the wrong choice, what the --
>> Tim Mattson: Right. So how do you decide, as you move through the layers
of the pattern language, whether you've made a right choice or a wrong choice? Yeah.
So there's two things to point out in your question that are important to
appreciate. And one is that there's no single path. And that's the issue. How do
you know you have the right path? I submit that it's iterative. You go down all
the way through to the lower layer, now you have a stack of paper with boxes
and diagrams and little notations and you get to the bottom and you go, gee, that
was, you know, is there a better way to do it, and you have to iterate back
through.
It's not objective -- it's not, you know, something where you can assign a numerical
weight to each branch along a tree and say, ah, yeah, this is the lowest cost path.
We're talking about a human process. It's fundamentally iterative.
>>: So you have no metrics at the higher levels to judge the branching until you
get to the bottom level, and then you just have performance and scalability?
>> Tim Mattson: Yeah, you get to the bottom -- well, usually, because we're
talking about software design, hopefully you get to this before you start having
code running and performance metrics. But you know, the whole idea as you go
down through the design cycle is: is it exposing enough concurrency? And if you
look at my book, we talk a great deal about that iterative nature: how much
concurrency are you exposing, is this reasonable relative to the target platforms?
You know, we like to talk about being architecturally neutral, but fundamentally at
some point you have to break that architectural neutrality and think, gee, I'm going
to be on a Windows platform, on a laptop with a 16 core processor. But it is
iterative. And it's fascinating to me, if you do look at the psychology of
programming, that the iterative nature comes out in study after study after study.
We teach people in school a top down method of software architecture, but the
fact is that no good professional programmer programs that way. The really
good, productive ones always sort of iterate down a little bit, then jump back
up, then iterate down again and jump back up. And therefore a notation and a
framework that supports them has to support that iteration.
>>: All right.
[applause]