>> John Feo: Okay. So today we have Tim Mattson from Intel to talk to us about parallel programming, particularly parallel patterns. Tim describes himself as an old applications programmer. I've known Tim for, I don't know, at least ten, 15 years, and that must make me older, but at any rate he's worked on a wide variety of machines going all the way back to the Cosmic Cube and the Cray-2 and the vector machines and the parallel platforms, and a variety of languages, everything from Linda and Strand and HPF and MPI. So lots of experience. And I think that's the best way to really, you know, understand why parallel programming is difficult and the different alternatives. Tim has been working at Intel since 1993 and he's their evangelist for parallel computing. And his current interest is in the design of pattern languages for parallel programming, which he'll talk about today.

>> Tim Mattson: Okay. Thank you. So, well, first off thank you for inviting me to come here and talk to you folks. I'm just based down the road in DuPont, so it's fairly trivial for me to just bounce on down here and talk to people. So it's funny that I almost never get here. But hopefully that can change over time. I suspect this might be a little bit different than many talks you get, in that this is an ideas talk. I'm throwing out some ideas. I have some firm beliefs on what we need to do to solve the parallel programming problem and some ideas on how we can get there. But we're not there yet, and so I'm not going to have a specific programming language and specific technologies. But hopefully I can lay out a picture of some of the things we need to do and should be doing, and my hope is that at least one or two of you out there at the end of this talk will go, yeah, yeah, let's work together to make this happen faster. And so I'll have more about that in a second. But I am from Intel and therefore I always have to start with a disclaimer. These are my own views, they are not Intel's views. I did not have Intel's lawyers go over this talk in advance, so if I offend you or make any of you upset, blame me, don't turn me in to Intel, please. Also, I work in a research group. There's absolutely nothing I will say that has anything whatsoever to do with an Intel product. I just love that. And I have tried my best to keep this talk completely free of Intel IP. So I think I'm safe. So let's go ahead and get started here. The many-core vision: you know, a year ago I had to have three or four slides introducing this. I don't anymore. We all buy in. Moore's law is going strong, and it amazes me that we're already demonstrating samples at 32 nanometers, so we are really sticking to this schedule. And to really understand what this means: by 2017 we should be making it to the eight nanometer process technology, which is mind-boggling even to contemplate, and that means 32 billion transistors will be the integration capacity. And what this all comes down to from Intel's point of view is, what the heck are we going to do with all those transistors? Because the day people stop valuing our transistor densities, our business model falls apart, the business model in the industry falls apart, all hell breaks loose, and life becomes miserable. So we've got to figure out how to keep those transistors at high value. And you all know that means many-core.
And I want to emphasize, I'm trying to get people to stop thinking about multi-core, which is where you take SMP technology and put it on a single chip, and to think many-core, where you have general purpose cores, special purpose cores, and an interconnection fabric that ties them all together. It is crystal clear that this is where we're going. And therefore from a software point of view, from a hardware point of view, this is the direction we need to start thinking. So how many cores in many-core? Well, you know, dual-core in '06, quad-core in '07. I had the unique privilege of being responsible for the software team on this 80 core chip, so you know, we're already in the dozens and hundreds -- it's possible to build dozens of cores. How many cores we ship of course depends on what the market's ready for, and of course the market is not ready for a beast like this, but we already have a good idea of how to build them and are continuing research to figure out how to make them really practical. But this is not new territory for Intel. We've been doing hundreds to even thousands of cores, you know, spread out over many machines. We've been doing it for a long time. And I want to emphasize this because I think some of you, especially you younger folks -- I see some gray haired folks out there who have been at this for a while and have a real understanding that this is really, really old -- but I think some folks especially new to the field don't appreciate just how old it is. Intel shipped a commercial hypercube in 1985. I think nCUBE just beat us to claiming to be the first commercial hypercube. We were right after them. So 1985 is a long time ago. And you know, we built some of the first gigaflop machines, and eventually -- for some reason my picture doesn't show; oh, man, there it is, why didn't it show the first time -- eventually by '96, '97, we built the world's first teraflop machine with over 9,000 processors. So, you know, we know the hundreds to thousands of cores range. We know what it takes from a hardware point of view, from an operating system point of view, from a parallel programming point of view. This is old territory to us. And of course what it got us was membership in the dead architecture society. And I really think it's important for people to keep coming back and looking at pictures like this, because it's a been there, done that: you know, in the late '80s and '90s everybody had to go parallel, it was the big bandwagon, we all jumped in, and I recognize some of you probably were attached to some of those companies that failed, because as you notice, there aren't very many arrows continuing to modern times, and the companies with arrows continuing to modern times, like this one and SGI, don't look anything at all like they did back in the heyday. You might conclude that parallel computing is toxic and that we are utterly nuts to be going back into this space right now. And I think if we don't spend some time asking ourselves why it failed in the past, if we don't understand that, we have no chance of getting it right this time around. And I really would like to get it right. So I think clearly what went wrong -- there we go -- was the software. You know, we figured out how to build wonderful hardware back in the heyday of massively parallel computers, but the software never showed up.
Now, in the national labs, where you had folks who could spend and do whatever it took to get the software, they had all the software they needed. But we firmly believed back in the late '80s and '90s that every engineering company, from the oil industry to the automotive industry to the pharmaceutical industry, would need and have a tremendous appetite for these systems. And the software never showed up. So obviously what we have to do is think about this as a software problem. It's not a hardware problem. We can build all sorts of weird hardware, but if the software can't take advantage of it then it's useless. Can we generate the software automatically -- parallel software automatically? It amazes me that even today I hear people talking about how we can have implicit parallelism, where you describe the algorithm declaratively at a high level and then some magical tool will turn it into a parallel program. Look. We know that doesn't work. We spent decades of research trying to make that work. Why we possibly think it would work this time around, I don't understand. It's a valuable research approach -- I'll get to that in a second, it's a valuable research approach. I like people to keep doing research and thinking about it, but I'm not going to bet the future of my company on someone figuring out how to make this magic work, where I can express my program and it can automatically discover the concurrency. So I just don't think that's a line that I see being productive.

>>: [inaudible] I wonder, though, is it really [inaudible] being able to write the software? What about the data? I mean, if you don't have the data to feed this monster, then all this [inaudible]. Were they really at the point where they knew, you know, how to represent their data, put it in a form that they could really compute over it? I mean, it seems like the big difference between now and then is we have a heck of a lot more data.

>> Tim Mattson: Well, I --

>>: We also have more standard ways for talking about it.

>> Tim Mattson: So your question is, was the issue really a lack of parallel software, or was it the inability to handle the problem at the data level and have the data that could take advantage of it? And I guess I have to -- I mean, I think I understand where you're going with that. But there were several industry sectors where we tried really hard to get penetration for parallel computing and failed, and they had plenty of data. So I'd look at several of the engineering disciplines, and at the pharmaceutical sector, where I did a lot of work. The data wasn't the holdup. They had the data, they understood how to format the data, they had the big problems with the big data sets. Moving it around was hard. I mean, the old saying was that an old MPP was like feeding a supertanker with a drinking straw. So there were those bandwidth issues. But in terms of did they have the data, did they have the problems, did they understand the data and how to address it on these large parallel machines -- that I don't think was the issue. Yeah?

>>: I mean, I really want to come back to: was it really software or was it just not cost effective at that time? Because industry after industry, you could say, oh yeah, it seemed like they should be doing this, but maybe the failure was you guys just didn't make it cost effective.

>> Tim Mattson: No, I -- go ahead, John.
>> John Feo: I think it's a combination. I think the problem is, like you were saying, getting the data to the processors was the difficult thing, okay, and I think that's where the machines didn't perform well. And you could say, well, we didn't have the right tools in order to sort of collate the data and the operations, or maybe we didn't have the right hardware, which made it difficult to move the data to the processors. So I think it's the data movement, the data layout problem, which was, A, difficult to do, and, B, caused the performance and scalability to be so poor.

>> Tim Mattson: Yeah. I'll respond to that second --

>>: How much of that was they couldn't do it, and how much was simply that you could do it cheaper on commodity hardware [inaudible]?

>> Tim Mattson: No, they're not commodities. Okay. Wait a minute. Wait, wait, wait.

>>: No, but --

>> Tim Mattson: A commodity is defined as a product where there's no distinction between the sources you buy it from, and a microprocessor is not a commodity. People commonly call it that, but there are real differences between an AMD and an Intel. At any rate, I'll get off that now. [laughter] What bugs me is when people at Intel call them commodities. And it's just like, no, I know the market calls them that, but it's very important to us businesswise that people don't think of them as a commodity. Okay. So several great ideas here. You know, obviously I'm oversimplifying. So yes, there were data issues that made these machines impractical. Yes, part of the problem was Moore's law, the microprocessors coming out. Once we hit our stride with the Pentium Pro in the mid '90s and then its follow-ons, we finally had a processor that really could do floating point well and fast, and that created a pressure. So that was a factor. But let me respond to the software problem. My role at Intel has always been the applications person who can speak the language of the scientists and who can speak the language of the computer and software architects. And in the pharmaceutical sector, where I worked very, very closely, they had a whole slew of programs they used all the time. The end user medicinal chemists had their productivity software that they were using in their domain -- their molecular dynamics, their quantum chemistry, their process control flow applications -- and pieces of them would run on these machines, but you wouldn't get the whole thing running. So it killed the conversation about adopting parallelism before it even got to issues of data movement and whether it's competitive with other opportunities. You know, the fact was the software wasn't running on the parallel machines, so it made it really hard to go anywhere with these conversations. So all of these factors were important. But we couldn't even get to the point of working through those factors, because the software wasn't there to start the conversation. At any rate, the software was a huge, huge part of it. All right. So this is a little bit controversial in some circles; especially the younger folks are really trying to push this idea that a magical tool will solve the problem. But we know better now. I mean, we've got years and years of research. Our only hope is to get programmers to write parallel software by hand. People have to create parallel software, and at some level they have to do it explicitly.
They have to describe the concurrency and they have to describe something about the way that concurrency is going to be managed. So the solution of course was, let's create a new parallel programming language. And this is a slide I just love. I went through the literature and tried to gather together a lot of the important languages you could find in the mid '90s. And to give you an idea of what a parallel programming nerd I am, these are the ones that I've at some point written software with. So -- I mean, it's just ridiculous how many. Now, what I want to assert, and what I want to warn you all about, is that this was really, really bad. This was a bad thing. And I want to convince you of that, and ask the question: did this glut of parallel programming languages help us or hurt us? And I want to describe what I consider to be one of the most important computer science papers of the last few decades. This is the Draeger's grocery store study. Have people seen this study before? Okay. Good. This is really important. And since Microsoft is such a technology engine, I want you to really internalize what this study told us. So this was done by a marketing research group at Stanford -- I forget what department they're in. Draeger's, the Draeger's grocery store -- I guess it's in Palo Alto or in that neighborhood -- is one of these gourmet specialty grocery stores that has just the really high end specialty items. And they set up two displays of gourmet jams. One display had 24 jars, the other display had six jars. And the idea was that a customer could come up, sample the jam, and get a coupon for a dollar off. So if you think about this, what they could do is count how many people walked into the store, how many people sampled the jam, and how many people eventually purchased. So they could easily gather that data. Then what they would do is randomly swap the displays, and they trained graduate students to stand there and properly present the product, so they could see how these two different displays worked. Now, what they found is that the 24 jar display indeed attracted more people. 60 percent of the people who walked in the store walked up and would sample at the 24 jar display, whereas they only got 40 percent of the people with the six jar display. What is really critical, and what completely shocked them when they did this study, is that 30 percent of the people who sampled actually purchased with the six jar display, and only three percent purchased with the 24 jar display. Now, this has been confirmed time and again. They went on and did other studies. So one could argue that, well, this is just jam, it's frivolous. But it's been confirmed with 401(k) plans at businesses. They would look at companies with a huge, complex array of 401(k) plans that you could voluntarily contribute to versus companies that had three or four, and they found the companies with three or four actually had higher participation rates. So this has been confirmed with things that matter a lot more than jam. And they coined a term for what's going on here: choice overload. There's a natural human tendency, when you present a human being with too many choices, to just walk away. It's like, it's too hard to make a choice now, I'm going to go away. So now, keeping this in mind, what do you think the response was to this?

>>: [inaudible].
>> Tim Mattson: I really think that back in the '90s we hurt ourselves pretty seriously, and I don't want us to do it now. I think it's a good thing that today -- now, there's an HPC bias here, I understand -- but today there really aren't that many parallel programming models that are heavily used in HPC. You know, there's of course the people who do hand threading, with the Win32 API or POSIX threads in the UNIX or Linux space. If you want to do compiler directives, there's OpenMP. If you want to do message passing there's MPI, and frankly MPI trumps all the others in the HPC space. And then there's these new kids on the block. CUDA has really taken off, but CUDA will rapidly transition to OpenCL, because OpenCL does functionally pretty much the same stuff as CUDA but it's an industry standard, so you're not tied to one vendor. And interestingly, Erlang seems to really be gathering momentum, and it's kind of the exception that proves the rule, because it emerged from an end user community. It wasn't computer scientists or people at a chip company sitting down saying this is how I think you should program; it was a telecom community that had a problem, and they created a language to solve their problem. And then they rolled it out. So it's very interesting how that one evolved. But the fact is there's not very many today. And since choice overload is a very real phenomenon, what I caution us in the industry on -- Intel, Microsoft, all of us out here working on this problem -- is: new languages for research are great. Please spin off as many languages as you can, and keep them inside your walls and study them and figure out what works and what doesn't work. But when you're ready to deploy to the outside world, less is more. We can actually damage ourselves by spinning too many out. This worries me because at Intel right now I'm aware of seven different parallel language projects, and I shudder to think what happens if we try to turn them all out. I bet you guys have even a lot more than that. Yeah?

>>: [inaudible] that used to have a department of product prevention.

>> Tim Mattson: Oh, really, DEC had a department of product prevention. I love that. That's good. Okay. So what I would rather see people do -- the tendency in parallel computing historically has been, if you see a language you don't like, make a new one. And I ran into this back when I was at Yale. I couldn't attract graduate students very well when I was on the faculty at Yale, because I wanted to do research in usability, research in how effective different languages were. And they realized they'd have a tough time when it was time to get the PhD if all they knew was how to use languages. But if they created their own language, yeah, yeah, that was considered PhD-worthy work. So I really think we need to take on a different mind set, and we need to think of how we can fix what we have today before we turn to something all new. A classic example of this is that OpenMP 2.5 could not handle a simple pointer-following loop like this. It was very awkward what you'd have to do to deal with that construct. So rather than throw OpenMP away, we amended it in OpenMP 3.0. So like all things in OpenMP, the parallel construct says give me lots of threads; the single construct says the following block between the curly braces will be done by only one thread. So this thread goes through and it creates tasks, separate tasks, to process this loop.
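A minimal sketch of the kind of code being described, in C with OpenMP 3.0 tasks handling a pointer-following loop (the node_t type and process() routine are hypothetical stand-ins for the example on the slide):

    #include <stdio.h>

    typedef struct node { struct node *next; int value; } node_t;

    /* Hypothetical per-node work. */
    static void process(node_t *p) { printf("%d\n", p->value); }

    void traverse(node_t *head) {
        #pragma omp parallel        /* "give me lots of threads" */
        {
            #pragma omp single      /* one thread walks the list... */
            {
                for (node_t *p = head; p != NULL; p = p->next) {
                    #pragma omp task firstprivate(p)
                    process(p);     /* ...every thread helps execute tasks */
                }
            }                       /* tasks complete at the implicit barrier */
        }
    }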
So now we've extended OpenMP and made it so that it can handle these kinds of structures, which are very common in modern software engineering. We fixed the API rather than throw it away and create something new. And I would really -- you know, this is less sexy than coming up with the new whiz-bang Tim-MP, but I think we really need to think that way, of fixing what's out there instead of creating things that are new. I might add, when I look at language design, I look at C, C++ and Java as all one family, because to me the programmer syntax matters. What you do under the covers, whether it's a virtual machine or a just-in-time compiler, I could care less. All I care about is the code I write. And I love the fact that C# looked at that same tradition and said, how do we make it better? I mean, that's the kind of thinking that I would like to see extended to parallel programming. Take what's out there, accept it, and fix it. But patching up old APIs is not enough. I think that's a critical step, but there's more we have to do to solve this problem. One issue is what I call the parallel composability problem. Modern software is not written the way old scientific software was, meaning, you know, in the old days a program would come in, you'd have a team at some lab, and the team would sit down, roll up our sleeves, and we would parallelize it and tune it. Now, of course, software is written as components; they come from many sources, they come from many languages. So the parallel programming idea that assumes everything is under one umbrella in one program just falls apart, and you have to be able to compose individual modules together. A good example of this is the MKL library. This is the math library that Intel has. It uses OpenMP. And if you had an OpenMP program built with the Intel OpenMP compiler calling MKL, which uses OpenMP, you could not fit them together consistently. What I mean is you would have oversubscription problems -- you would have all sorts of problems, because there's no concept in the design of OpenMP or the supporting runtimes for how to compose modules. So this is a huge problem. Now, I think the only people who can solve it are companies like Microsoft that control their infrastructure. And I know you have a common concurrency runtime project, and I saw a presentation on it and I felt like jumping up and applauding, because the only way we're going to solve this is if things map onto a common runtime; then you can do resource management and handle the oversubscription problem, and you can handle data locality issues that cut across modules. So I'm assuming you guys are going to solve the composability problem, at least in your universe, and from Intel's point of view that's good enough, that's pretty cool. So that's great. So I don't need to deal with that one. But the other area is: how do we intelligently evolve parallel programming languages and APIs so that they work for mainstream general purpose programmers? That is the second problem. And that's one where I think standards and working across the community are key. And this is something that Intel's really good at, because you know, we're software neutral and we can bring people together from very different persuasions.
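A minimal sketch of the oversubscription scenario described above, assuming a hypothetical library_solve() standing in for a threaded, MKL-style routine:

    #include <omp.h>

    /* Hypothetical stand-in for a threaded library routine:
       it opens its own parallel region internally. */
    static void library_solve(double *x, int n) {
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            x[i] *= 2.0;
    }

    void caller(double *work, int n, int nblocks) {
        /* The application also parallelizes, over blocks... */
        #pragma omp parallel for
        for (int b = 0; b < nblocks; b++)
            /* ...so if the library is built against its own threading
               runtime (as in the MKL case described above), T outer
               threads each creating T inner threads means up to T*T
               threads competing for T cores, and neither runtime knows
               the other exists, so neither can back off. */
            library_solve(&work[(long)b * n], n);
    }

A shared runtime underneath both, of the kind being described, would let the two levels draw from one pool of threads instead.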
And because of our nature -- we sit at the heart of the computer, but someone else's GPU often sits there too, and some other vendor is actually selling the machine -- we're actually in a very good position to work on this problem. How do we figure out how to intelligently evolve the parallel programming infrastructure? So that's what I'd like to talk about. Now, when I look at the history of parallel programming and that huge slew of languages, you would think we would have solved this problem by now. But we haven't. And I am convinced -- and it takes me a long time to develop why I'm so convinced of this, so I won't do that now, but I could walk you through it at some point -- I'm convinced a huge part of it is that we did not approach the problem scientifically. We did not approach parallel programming scientifically. We approached it as engineers. If I had a new idea for a parallel programming language, what did I do? I found some graduate students and we created it, and we wrote some codes with it, then we published a paper, and we jumped up and down and patted ourselves on the back, and then we'd go off and do the next one. And that's very effective if you're building bridges and buildings; it's not effective if you want to develop a theory for how to improve and evolve a technology. For that you need some kind of systematic basis for comparing them. Basically, survival of the fittest has been proven to be an incredibly effective mechanism, but you have to have some way of deciding what's the fittest. And that means you have to have a way to measure and assess and talk about parallel programming languages and compare them. I think that's one of the fundamental things we need to do. And that's one area I would love to work with you folks on. So how do we compare parallel programming languages today? Now I'm going to pick on a friend of mine at Intel, because he has a slide like this that he used to carry around a great deal. And I'll leave his name anonymous, though some of you may know who he is just from this slide. But he'll talk about his super cool whiz-bang nested data parallel language: look how superior this is to OpenMP, and man, doesn't that say you absolutely want to use that nested data parallelism? But what does this really tell us? It doesn't tell us anything. He doesn't explain why the OpenMP code was written that way. He doesn't give us any performance data. He doesn't give us anything. All this tells us is that he prefers nested data parallelism to OpenMP. So okay, we get that. I don't know how valuable that is, though. So frankly, I think my colleague should be absolutely ashamed for ever using a slide like this. Shame on him. And shame on me. This is a slide I often use in my OpenMP talks. Look how nice and elegant this OpenMP is, but gosh, if I did it with the Win32 threads API -- oh, my God, you don't want to write this crap. No way. This is much better. So the point I'm trying to make is, look, we're all guilty of this. We're all guilty of engineering by potshots. And that's got to stop. If we want to evolve the state of the art, then we have to support the state of the art with systematic and careful comparisons. So I'd like us to do it right this time around. I really feel that we're on the cusp of a second chance. We blew it in the '80s and '90s -- we meaning those of us doing research in parallel software. We created some great stuff, like Sisal and Linda and Strand, wonderful languages that nobody used.
This time around I'd like us to create stuff that actually works and solves problems and that people really use. So I'm convinced that what we need to do is create a systematic, scientific method for working on software. In other words, I can put out a hypothesis, I can test it, I can peer review it, I can understand what worked and didn't work, so then I can feed that back in and over time evolve systematically toward a parallel programming technology that will really work. That procedure has never been done consistently in parallel programming. And that's what we have to do. And I break it down into these three steps. We first have to develop a way to talk about how people solve parallel programming problems. We have to develop that human angle. What are the ways people attack a parallel program, what are the patterns they use in building a solution? We have to create a way to understand that and talk about it. Then we need to come up with a jargon, a terminology, for how we talk about programmability. So that when I'm sitting in a room of experts, I don't say, gosh, F# is just a nightmare, I couldn't imagine using that -- oh yeah, well, OpenMP sucks. I mean, you've got to get to the point where you can really talk about what's good about something, what's bad about something, how the tradeoffs are made. So we have to create that terminology for talking about programmability. And I'm going to close just briefly with a plea that we have to come up with metrics, programmability metrics. So let me go ahead and walk through this. Now, we found in the early '80s when -- early '80s, early '90s, gosh, you know you're getting old when you get off a decade with things. In the early '90s, some of you older people here will remember the emergence of object oriented programming. I mean, the technology was really, really old, but it took off in the late '80s and early '90s. And in the early '90s I would characterize the object oriented programming world as utter chaos. It was chaos because we had these different languages out there, which isn't bad -- Smalltalk, Objective-C, this weird thing called C++ -- but there was really no idea of how to really use them. So you had these horribly expensive projects. The one at Mentor Graphics is the one I was most aware of, because they're right next door to where I was living then, where they spent millions and millions of dollars with teams of engineers redoing all their software in C++, and then it didn't work; they could never get it working right, and once they did, it didn't run very fast. Because nobody understood how to really do object oriented programming. And what you had to do was write a whole bunch of programs and fail many times, and then eventually you'd get it and you could do object oriented programming and have a lot of success. So a book came out in the early to mid '90s on design patterns. And this is a famous book. I have a feeling if I went around to your cubicles, most of you would have this book on your bookshelf. It's the Gang of Four book on design patterns. And it almost overnight got the object oriented programming field to grow up. Because now, all of a sudden, if you were new to the field, you could read this book and you could know the tricks of the trade that the experts took for granted. And that was really valuable if you were trying to learn object oriented programming. It was also really valuable because if you had a room of experts sitting around, they could say, gee, what do you think we ought to use here? Well, maybe the factory method.
Oh, yeah, okay. So you have a jargon that the experts could use when talking about their field. So really, it was amazing to witness, as a software engineer out there earning my living writing software, how almost overnight object oriented programming went from a chaotic jungle cowboy land to a systematic engineering field. Now, a design pattern is a solution to a recurring problem that captures the excellence in the solution to that problem, so you mine patterns from excellent solutions. It's not necessarily anything brand new; it's just codifying what the experts have worked out, putting a name on it, writing it down in a standard way. Now, this book was a catalog of patterns. If you go back to the origins of patterns, they talk a lot about a pattern language. And I want to emphasize the importance of a pattern language. A pattern language has the idea of a flow through the language. Patterns interlock, they fit together. I have high level patterns that lay out the solution at a high level, and lower level patterns; they hierarchically nest, and you flow from high level down to low level patterns as you go from the start of a problem to the end. So with the catalog, as I said, you just look it up: that's my pattern. With a pattern language, I use this pattern, which then flows me to this pattern, which then flows me to this pattern. So a pattern language includes this higher level knowledge on methodology that's missing in this book. So what I did, starting back in the late '90s -- it was published in 2004 -- is I attempted to do for parallel computing what the Gang of Four did for object oriented programming, and I wrote this book with Beverly Sanders, who is vastly smarter than me and comes from the formal verification community. So she -- I want to do things fast, she wants to do them right. So we actually were a great team. And then Berna Massingill, who I met when she was a post-doc at Caltech, an old applied mathematician -- well, she's not old, but an applied mathematician who does lots of work in parallel programming. So the three of us wrote this book. And it is a full pattern language. And one of the fun things about writing it is the examples are in MPI, OpenMP and Java. So it was really fun to take a pattern and see how you do it in these different languages. However, the three of us basically come from a scientific programming background, so if you read this book today, you would see that it does a really good job of capturing how the whole HPC era of parallel computing did things. So in some regards it's a little bit narrow, on scientific computing, even though that wasn't our goal. And we've learned a lot since we wrote this book. So what I'm doing right now at the ParLab at UC Berkeley is the next generation: kind of throw this away and do it right this time, but pay a lot more attention to bringing in a broader community of programmers, so that what comes out at the end isn't for scientific computing, it's for all computing. It's for general purpose computing. Now, is everyone here familiar with the ParLab at UC Berkeley? Some yes, some blank stares. Let me show you the one picture about the ParLab. So the ParLab, this is the group at Berkeley that Intel and Microsoft are funding through the UPCRC -- I can never keep all the letters straight. But we are together funding this lab to the tune of a big pile of millions of dollars. All right?
So you have a whole team of very smart people who are really excited about making sure that Microsoft and Intel are happy, so five years from now maybe we'll give them more money. I mention that, though, because I have found to a remarkable degree they're very, very open to someone like me from industry coming in and saying, gee, have you thought about doing it this way? And they've changed -- they've made huge changes in how they do things. So I'm really pleased at the extent to which they're looking for input. We're not just money bags; they really want us to be direct collaborators. And I am very closely and directly collaborating with them. So the way they're looking at things is, they have at the top a collection of applications that are driving the research. The problem they're addressing is to bring the world to a point where you write correct software that runs efficiently on many-core, and they like to say that they're addressing the single socket problem. So they're not trying to address the HPC problems and the cluster computing problems. Those are all very interesting, but let's just focus on the parallelism that sits in a single socket. How do we make that software routinely parallel, correct, and efficient? So they have these applications at the top -- image retrieval, hearing and music, speech, et cetera -- and what we're doing is analyzing those applications, and we're going to create an understanding of how humans think about them and how humans parallelize them, in terms of a design pattern language.

>>: [inaudible] a bunch of cores on a single chip, is that what you mean by single socket?

>> Tim Mattson: That's what I mean by single socket, yes: lots and lots of cores on a single chip. Which -- I mean, I gave you the word they use, but it's not quite right; it's almost better to call it a single board, because they're also really into, you've got a GPU sitting there, let's use that, too. But the thing is, a lot of these people have relationships up the hill at the labs, where everything is huge HPC, so they need a simple slogan so when the HPC people come to them they can calm them down and say, no, no, no, we really mean your laptop. Your laptop. Your handheld device, the single platform.

>>: [inaudible] you have shared memory or not, right? [inaudible].

>> Tim Mattson: Well, maybe. But I don't submit that you have to have shared memory. I mean, yes, today in a single socket you tend to have shared memory between all the cores, but nothing says it has to be that way. So my research at Intel is doing a distributed memory cluster on a chip, and I'm finding that's an incredibly productive architecture, both from a software and a hardware point of view.

>>: So that would say that distributed memory is as hard as HPC?

>> Tim Mattson: Yes.

>>: [inaudible].

>> Tim Mattson: This problem is definitely as hard as HPC. In many ways it's harder. And it's harder for sociological reasons. In HPC I have programmers who actually enjoy, enjoy the difficulty. I mean, you just have to visit one of the national labs and sit around with the programmers, and you'll hear them go, like, yeah, man, it was great, I was up till two a.m. getting that routine debugged and I got it from 97 percent to 97.5 percent speedup, it was just great. They like that stuff.
But here, you have real world programmers who have deadlines to hit, and they have to add features and ship on time. So the sociology is totally different; these people will not take the pain the way the HPC people do.

>>: The memory may be shared, but is the cache shared? Because you don't want the cache all over the place.

>> Tim Mattson: So that's a separate issue. And from a hardware point of view, that is the crux of the issues we're working on right now at Intel. So a lot of the -- I mean, the Intel products today are shared cache.

>>: [inaudible].

>> Tim Mattson: Not the L1s. They have their own L1, but they have a last level cache they share, okay. So you have an L1, sometimes even an L2, that's per tile. That 80 core chip that I worked on was shared nothing. There was no sharing at all. I have a research project going on right now that we haven't released in public, so I can't say much about it, but it's kind of a hybrid: there's some sharing, and some things aren't shared. So I mean, we're basically exploring the design space very broadly, which is why I responded to your comment about it meaning shared memory. Well, it doesn't have to mean shared memory. And when you look at the overheads of managing a shared cache, you quickly get to the point where you're eating up the bandwidth on that on-chip network just keeping track of who has which piece of the cache. And at some point you have to wonder, wouldn't we be better off to just get rid of the shared cache? Now, I know I'm getting off on a tangent, but I think this is an important tangent for people thinking about how to do parallel programming. I have found anecdotally, when I sit down with old timers like me who have done tons of shared address space programming and message passing programming, that we prefer message passing to the shared address space. Why? Because in message passing, any time sharing occurs it's explicit in the text of the program. I know sharing's occurred, and if I don't have that send and receive or communication routine, then I know I have isolation. Whereas in shared address space programming it's the opposite extreme. There may be sharing going on that I have no clue about, and there's nothing in the text of the program to tell me. So I think there's this illusion out there among us computer scientists pushing parallel programming that shared address spaces are a good thing, and I don't believe it. I think in fact they may be a really bad thing, and we may need to really move away from them. So shared memory may be okay, but I like the idea of a partitioned global address space. There's an address space, you can get to the memory, it's shared, but it is apparent in the text of the program when sharing occurs, so there's a discipline on how that sharing occurs. And keep in mind as I say that, I'm Mr. OpenMP -- you know, I've spent the last ten years working on OpenMP in a shared address space. So in some regards it hurts me to say this, and my OpenMP people throw things at me when I say this, but I really do think some day we will look at this whole shared address space trend that started back in the '80s and '90s and just think it was a big mistake and throw it away. So I don't necessarily think it will be shared memory. At any rate -- yeah?

>>: [inaudible] a shared name space has been the foundation of traditional sequential composition on which we've built layered software.
If we throw that out, are we throwing out all our notions of modularity and layering that went with it?

>> Tim Mattson: No, no, I don't think so at all.

>>: How about all the current software that's built on top of those notions and layering?

>> Tim Mattson: Well, that becomes very difficult then, doesn't it? Yeah.

>>: [inaudible] he's the invited speaker. [laughter] So we'll let him go on thinking that we can.

>>: I know, but in [inaudible] caching memory there's some ramifications.

>>: I'm keeping quiet.

>> Tim Mattson: There are huge ramifications. At any rate, at Berkeley, then, you have the applications at the top driving things, we have a design pattern language that captures the idea of what those applications are doing, what the concurrency is, what the parallel algorithms are, and then you have a lower layer that they call the productivity layer. And the idea is that you have a large universe of programmers who are focused on just getting the job done. They're usually closer to a domain of expertise -- not hard core computer scientists, definitely not hard core concurrency experts. So they're living up at this range here, in the productivity layer. And the hope is that they can do a lot of what they need through high level languages, maybe declarative languages, parallel libraries, high level frameworks. You know, this is where you really need to raise the level of abstraction and make it easy for the general purpose programmer. Then there's a smaller number of programmers who work at this efficiency layer. These are the traditional HPC people. You know, the advantage we had in the HPC world is everybody was in the efficiency layer. So we didn't even have to worry about the productivity layer, because we didn't care to be productive, we just wanted to be efficient. So you know, MPI, low level languages, threads; they worry about how you get the performance up on the libraries, auto-tuners that search parameter spaces. So these are the people at the efficiency layer, who worry about getting every last ounce of performance. The productivity layer will trade off performance for the ability to quickly engineer software.

>>: I wouldn't say that the HPC people don't care about this, they care greatly about the [inaudible] but they'll always throw it overboard for an extra [inaudible].

>> Tim Mattson: You're right. The HPC people -- I know, I'm being slightly facetious. They do care about productivity, but they won't take -- I mean, they'll say things like, I need a new programming language, it's got to be easier to write software; but then when you give it to them they throw it away, because it costs them a few percent. So I often say that the HPC people are just a bunch of whiners and that they should just shut up and sit down with MPI and leave us alone, because when all's said and done, you just give them MPI and they're off and running and they get their jobs done. And they do beautiful work. So I'm frustrated by them, but on the other hand, they're kind of my people. I understand them. Yes?

>>: The two layers seem to pretty much match the database people, who have the database manager and the database user. How well does this compare?

>> Tim Mattson: I have no idea how this compares to the way the database people break things down, because I'm not a database person.

>>: You're not.

>> Tim Mattson: But I would imagine that if I looked at different slices of computing, they'd have a similar breakdown.
And in fact, I was at an OpenCL meeting recently, and one of the interesting things for me about OpenCL is we have participation from game vendors. So I actually sit down and talk to real gaming software developers. I never talk to these people. And they were talking about technology programmers -- a very key thing for them in producing a game is that time from a technology programmer is really, really expensive. Well, when I probed them on what they meant, the technology programmers are what we'd call the efficiency layer guys, and it's about one percent of their programming staff, and they pay them a lot more. So anyone looking for job security, you know, learn how to be in this layer. They get a lot more money. And they do everything in their power to minimize the amount of time they have to go to those technology programmers. The bulk of their software developers are up in this productivity layer. And that surprised me, because inside Intel we'd always been told that, you know, game developers have to hit those realtime constraints, they're very performance centric. But actually, increasingly they're moving not just to C++ but to scripting languages, Perl and Python, and it's really bizarre. But at any rate, they too were talking about this split. The Berkeley crowd, we think 90 percent are productivity, 10 percent are efficiency. In this gaming community they were saying 99 percent were productivity and one percent were efficiency. So it seems things are naturally migrating to this split. And I've talked to people in the business world, and they've similarly said there's this split somewhere between 90-10 and 95-5 percent between the productivity and efficiency layers.

>>: Do you mean [inaudible] financial sector?

>> Tim Mattson: Not the financial sector, because a lot of that looks like HPC. I'm talking about business process, sales resource management, point of sale management -- you know, another area I know as little about as I can get away with.

>>: [inaudible] and that type of stuff. You know, their model is right many times on [inaudible] so they do really want productivity [inaudible]. Same thing with [inaudible] always changing [inaudible].

>>: In the financial sector?

>>: Well, in the [inaudible] analysis.

>> Tim Mattson: So at any rate, finishing out this overview of the Berkeley folks: you have groups working on applications, you have groups working on patterns, you have groups working on the productivity layer, you have folks working the efficiency layer. You have a group I just want to mention briefly because I'm really interested in it: what they're doing is rethinking the role of the operating system and saying, well, what if I expose the scheduler as a first class item that the efficiency layer people can manage? Because you know, most of the time the scheduler is buried down in the OS, and I have very little way to interact with it. So it's very, very interesting how they're blurring the line here between the operating system and the efficiency layer, and it's one of these areas where I'm not participating in the research but I'm watching it very, very closely, because it's fascinating to me. And some of the grad students working on this are graduating soon. So if you're looking to recruit some really interesting people, that's a good place to look. Then they have an architecture group. So it's really top to bottom, what they're doing here in the ParLab.
And then we haven't heard much from them, but the whole idea is that off to the side there are the people who are going to worry about correctness -- but I'm an old HPC person, I don't care about correctness. [laughter] It doesn't have to be correct, it just has to be fast. So the work I'm now describing is work I'm doing with Professor Kurt Keutzer, and I don't know if many of you know him, but he's very well known in the CAD community and spent, I don't know, 15, 20 years in the CAD business before he became a professor. What we're doing is, we realized that the problem we're trying to solve is bigger than just parallel algorithms, which is really what my book focused on -- just how do people do parallel algorithms? What we really have to do, if we're going to address the ParLab mission, is figure out the whole software architecture angle, how that looks from a parallel point of view. And so our influences are, you know, the traditional patterns community and my book influencing in one direction, a very algorithm focused thought of patterns; the Berkeley folks, who broke down computing into these 13 dwarves -- which political correctness doesn't let them call dwarves anymore, so I think they're calling them motifs -- but basically the 13 patterns are common computational elements that seem to appear again and again across domains. And then we looked at Garlan and Shaw, who had that classic book that summarized software architecture at a high level. And then finally, the influence that really came in strong was the very original work by Christopher Alexander on pattern languages. And this is where it all started, back in the '70s. And Christopher Alexander -- I don't know if you folks have read it -- it's actually a beautiful book, and very interesting to read. Because what he was trying to capture, he talks about that quality without a name. When you see something really, really special, something inside of you can tell that it's special. And wouldn't it be nice if you could get experts together who understand that specialness, and if they could write it down and share it? So it's very much that almost religious zeal for patterns and pattern languages. So you pull all of these together, and you end up with our pattern language. Okay, clearly that's a working title. Some day we'll give it a better name. But OPL stands for Our Pattern Language. And, you know, I've given long patterns talks many times, and it's really hard to make them interesting, because they get down into a lot of detail and they quickly get kind of boring. So the thing to do is just look at it at a high level, and then come to me and I'll give you more detail later, I'll give you things to read. But the idea is that we're trying to capture a language for the architecture of parallel application software. We start at the top here -- in fact, let me advance to the next slide -- we start at the top here with patterns that talk about the large scale architecture of your application. So pipe-and-filter, map-reduce, layered systems. These are concepts that will be very familiar to you if you know Garlan and Shaw and have been reading software architecture academically. Then over here we have the computational patterns.
This is the 13 dwarves. You know, the idea -- if you're familiar with the 13 dwarves -- is that it started with Phil Colella, who recognized and suggested that all of scientific computing was an instance of seven archetypes that he called the seven dwarves. There's things like dense linear algebra, sparse linear algebra, unstructured mesh, structured mesh, et cetera. And what the Berkeley folks did is they sat down and took that as their starting point and said, well, you know, if we add six more, we think we can cover everything. That's where the 13 dwarves came in. And then they would invite people in from different application domains to sit down with them and go through the dwarves to see if they had coverage. So they would bring game developers in, they brought the folks I work with in the workloads group at Intel in to look at the RMS workloads, they brought in financial people. So they really spent some time validating, to themselves at least, that these 13 dwarves do a pretty good job of covering everything. And you know, it's not cast in stone. In fact, we're very interested if someone looks at the list and says, gee, you need to add two or three more. The list will grow. The key thing is that it's a number on the order of 13; it's not a number on the order of hundreds or 10,000. It's a manageable number. And these are the computational patterns. So the way I like to think of it is, when I'm at the architectural level, if I'm walking up to a white board and I'm describing my application to you, at the architectural level I'm drawing boxes and arcs between the boxes. Then I'm ready for the computational level: now I'm saying what's happening in each box. And so it's iterative in nature; as I start looking at what's happening in each box, I may say, gee, those two boxes really should be merged, or maybe I should split this box apart another way. So we have these green arrows here, pointing out that it's kind of a back and forth between the architectural level and the computational level, until you've described the architecture of your application in terms of the main boxes, the connections between them, and the types of computations occurring in each box.

>>: Have you alluded to the notion of composition and how that is [inaudible].

>> Tim Mattson: Yes.

>>: Future parallel programming? Would you say that that's mostly a concern of your left box, the architecture box?

>> Tim Mattson: Is composition mostly a concern of the architecture box? I don't think so. But I'm not sure. I think it can't be just a concern of that box. And the reason is, I think it's hierarchical, and composition is going to occur at every level. What I'm talking about, for example: if I have a box and inside the box is a spectral method -- that's one of these patterns here -- well, in the spectral method I'm doing a transform, I'm doing some computation, then a back transform. So now I'm composing several different routines inside the spectral method. Well, when you start looking at doing higher dimensional transforms, there's computation that would occur inside there. So it's all hierarchical, and composition I think will influence every layer. It would be nice if we could just say I only have to worry about it here. And I think often I might get away with only worrying about it at the architectural level, but to ultimately solve the problem, composition will touch every layer of the patterns. Did you have a question or comment?
>>: I was wondering [inaudible] between the architecture and the computation. Do you think those are fundamentally different problems that need different solutions? Because that's what it looks like in this picture. Everything else is kind of layered on top of each other [inaudible] you have.

>> Tim Mattson: Side by side.

>>: [inaudible].

>> Tim Mattson: So we have those two side by side, and it's kind of artificial, because we're laying things out in a clean stack diagram, because we have to get it onto a two dimensional representation. And depending on the problem, things may stack differently. But I do think you're talking about very different things in these two cases, and the solutions I think will look very different. Because here I'm not saying anything at all about what is computed when I write the architecture. All I'm concerned about is, what are the major blocks and how are they connected? Whereas over here, it's all about what is computed. So I suspect the architectural patterns will be pretty straightforward for us to come up with some high level frameworks or a wizard to build. It's easy to think of some tools I can build that will support this architectural level.

>>: I [inaudible] I would say there's been [inaudible] more research done in the past on the computation side, because that's where all the HPC work is. On the architecture side I think it's this graph diagram [inaudible] techniques.

>> Tim Mattson: Okay. I'll agree with you that from a research point of view, in the parallel world there's been very little on the architecture side. But when I conceptually sit back and think about it -- I mean, the fact that I'm talking about it as drawing boxes on a white board, well, it's pretty easy for a wizard to work with that boxes-and-arcs representation. I mean, heck, look at Visual Basic. Imagine a Visual Basic-like front end where I just sort of pull down little icons and connect them together, and bam, it creates my framework for the high level architecture.

>>: [inaudible] let's figure out how data is being shared between those boxes.

>> Tim Mattson: It's very difficult to take an existing application and reverse engineer it into this. And that's a process we're going through now: validating this design, building case studies by looking at existing applications. So I respond to that in two directions. I gave my response on looking at existing applications: we're finding that it is very difficult. It's difficult because things compose together in complex ways, and it's also difficult because people writing software don't think architecturally. We do a very bad job, especially in HPC. We've done a very bad job of thinking about a software architecture in HPC, and that's where I come from; we would use a monolithic architecture, which means no architecture -- we just start writing code and merging loops. So I would say that part of what I'm looking at with this is people writing new code, trying to influence people to write new code differently; for new code I would like them to think of an architecture as they move ahead. If I only solved the new code problem, I'd be successful.

>>: But I think even for new code the question remains: what are the right abstractions to build your architecture?

>> Tim Mattson: That is the question. And we think we've got them. Or we think we've got a great start at it. What we're doing right now is testing that theory.
Now, let me emphasize, for those of you who haven't been in the patterns community long -- which is probably all or most of you -- in the patterns world we describe the process as pattern mining. A pattern is not something where you sit down and go, I think this is how it should be. Patterns are always mined from excellent solutions. So if you look at the things here and you start reading them and you go, gee, data parallelism, that's not new; pipeline, come on, people have been doing that for ages -- yeah, okay, good. Good. If you look at this and you see stuff that's totally new, then we've failed, because it should capture excellence in existing solutions. And that's what we consistently do. We go out and look at excellent solutions and we mine them and we understand the patterns, and we either validate that we have the right patterns in the right places or we find that we need to add new patterns. And this is the process where we need to grow the community and get more people in. Because with my book, there were the three of us, and we came up with something very nice for the stuff the three of us had seen many, many times, which is useful, but it's not enough. What's enough is to get a large community together and really boil down to a consensus on what these basic building blocks are. Yes? >>: [inaudible] solution [inaudible]. >> Tim Mattson: How do you know a solution is excellent? You know it when you see it. Yes. [laughter]. Very scientific, I know. A community -- a consensus in a community sees it as excellent. But, yeah, that is subjective. I mean, we're capturing the human element here. >>: For example, like when you were saying that you didn't care about correctness in the HPC world, you care about performance, an excellent solution would be one that [inaudible] but perhaps [inaudible] this is bad. >> Tim Mattson: That's right. >>: Not good. >> Tim Mattson: That's right. You are absolutely correct in that excellence is in the eye of the beholder, and therefore there is a subjective factor. This is very subjective. When you start talking about design patterns you're stepping over from the world of the typical computer scientist, where everything is clean and objective and very, very systematic, to a world where things are fuzzy and vague -- maybe you can take it this way, maybe you can take it that way. But that's critical, because what am I trying to do here? I'm trying to capture that human angle that we've traditionally done such a terrible job of capturing. So I'm trying to get it and write it down. But I need to keep going, because we're almost out of time and you're almost sick of listening to me, I'm sure. So this is just a one-slide example -- and, oh my, I learned long ago that if you see a typo on a slide you have to fix it on the spot or you'll never fix it. Okay. I'm not going to walk through this, just for lack of time. But looking at the CAD universe, you have an overall pipe-and-filter architecture -- that's what you see with the boxes here -- and inside the boxes you have these computational patterns that appear. This one picture actually summarizes a huge amount of what happens in the CAD space. And it's an example of how we've boiled the problem down into a form that we can go through and start systematically analyzing and working through, instead of just the very confusing mess that is the CAD applications.
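As a sketch of what a pipe-and-filter architecture looks like in code -- assuming each CAD stage can be modeled as a filter over a work item; the item_t payload and filter_fn type here are hypothetical, not from the slide:

    #include <stddef.h>

    /* Hypothetical work item flowing through the pipeline; a real CAD
       tool would carry netlists, layout data, timing graphs, etc. */
    typedef struct { int id; /* ... application data ... */ } item_t;

    /* A filter transforms one item. The computational pattern (dense
       linear algebra, graph algorithms, ...) lives inside the filter,
       which is exactly the architecture/computation split above. */
    typedef item_t (*filter_fn)(item_t);

    static item_t run_pipeline(item_t in, const filter_fn *stages, size_t nstages)
    {
        for (size_t s = 0; s < nstages; s++)
            in = stages[s](in);  /* the boxes and arcs, in code form */
        return in;
    }

Parallelism can then come from two places: streaming several items through the stages concurrently, or parallelizing the computational pattern inside an individual filter.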
So the patterns give us a way to precisely describe the computation, which is valuable for the experts sitting around talking about how do I support this in the software world and how do I build an infrastructure to support it, and it's valuable to the person new to this field who's trying to learn it. But it also gives us a way to compare and contrast. Remember that old programmability thing I started with? Earlier I could just say, OpenMP is great, Windows threads suck; or OpenMP is awful and data parallel programming is great. Well, now I can actually go through the different patterns and say, look, how well does this notation support programming in that pattern? Now, it's still a little subjective, but I've laid my assumptions out on the table for us all to sit around and argue about. So four or five of us, with experience with each one of these, could discuss and argue and trade off until we came to a consensus on how these things play off. And you can see if I'm doing SPMD-style programming, gee, everything works great. Okay. But if I'm going to do an actors-type architecture, well, gee, MPI is great for that -- MPI can do anything, right? MPI is great for that; OpenCL would be absolutely abysmal; and OpenMP, maybe I could do it, but it really doesn't support it well. So I give it a red box. Even though I know OpenMP very well and I know how to trick OpenMP into doing actors, I'm tricking it into doing it, so it's not fair to say it can do it. So this gives me a way, when I have a collection of programming models to compare, to lay them out against a set of patterns that I care about, and now I've productively said something about these languages, rather than: I prefer OpenMP and everything else sucks. So I think it's a huge step in the right direction. And let's see. Gosh, I thought I had -- because we're running low on time. I can't believe it. I lost it. It was a great slide, too. This is what happens when you pull slides together at the last minute: that slide is hidden. Oh, it's down here. Okay. I'm sorry, folks. So let me just go on. All right. Now, I want to emphasize that to us at the ParLab the patterns are a means to an end; they're not an end in and of themselves. So I think they're valuable, and they're valuable if we're talking about programmability. But the other thing that we really like the patterns for is they give us a roadmap for the frameworks that will support the productivity layer. So in our vision of where this will go -- you know, three years from now, when we're near the tail end of the five-year ParLab period -- we're hoping we get to a place where you have this small number of hard-core technology programmers who produce these parallel programming frameworks. And I'll use the word framework very loosely: it can be a language supported by a set of libraries supported by a wizard. I mean, it's software technology to support parallel programming. And then you'll have a domain-literate computer scientist, who understands the bridge between the efficiency layer and what a domain needs, and they're going to work with the parallel programming frameworks to create these application frameworks. And then you have my end-user programmers, who know nothing about parallel computing, and our goal -- I don't think it's possible, but the Berkeley people do -- is that that domain expert doesn't even know they're writing parallel code.
That domain expert just has this parallel application framework that's been given to them, and the parallelism is buried inside of there. I don't believe that's possible, but I believe it's a good goal to shoot for, and I believe if we only get 90 percent of the way there -- so they may have to say a little bit about the parallelism, but all the details can be buried -- then we've been very successful. But I'm hoping that the design pattern language we're talking about and working on will be the roadmap for building this hierarchy of frameworks. And that's the end goal in the ParLab for the patterns. And I'll skip that. So the status of where we are with that language: we're quite confident about the overall structure. It will need a little tweaking and moving around, and we may need to change some definitions, but that picture I showed you of the pattern language -- we feel pretty good that it's pretty close to right. The top-level patterns, we have good drafts for most of those. The bottom-level patterns, some of them come straight out of my book and therefore we have them; some of them still need to be written. But this term I'm teaching a course at UC Berkeley where we're basically writing the rest of that pattern language. We have the descriptions, we know what they are, but if any of you have ever written a pattern, you know that writing a pattern down is a very involved and difficult process. It's very fulfilling, actually. I find it very satisfying, because you have to do a lot of research to make sure you really have found that nugget that characterizes the good solution. And finding it and understanding it and writing it down is like writing a good, tight piece of poetry. So what we'd like to do is get more people working with us from a broader community; we want to grow the pool of people that are working with us on this language, because that's the only way we'll build a consensus for how this field works. So we have at Berkeley these all-day pattern workshops. The next one's coming up March 18th, and then we have one on April 22nd. You're all welcome to join. I would love to have the problem of turning people away because we have too many. That would be a great problem. And then we're having a conference, ParaPLoP -- the call for papers will be going out soon. It will be in Santa Cruz on the 4th and 5th, which is after the ParLab summer retreat. So we're actively growing the community to wrap this thing up. Now, we're running out of time here, so let me just mention briefly the other two ideas, and I'll skip a lot of slides -- but I can make the slides available to anyone who wants to see them. The next one is having a human-centered model of how programmers solve parallel programming problems. That's very important; that's what patterns are all about. But it's not enough. We need to have a language of programmability. Now, this came to me as I was studying the psychology of programming. If you ever want an interesting read, go out and search out the literature on the psychology of programming. There's not a lot of it, which I think is tragic and a huge mistake, because every known programmer is a human being. So you would think that anybody who wants to understand how to make programmers effective would want to understand how human beings go about programming. But there is remarkably little research happening on that front.
After years of working on the cognitive psychology of programming, one of the founders, one of the real leaders in that community, Thomas Green, came to the conclusion that coming up with a nice, neat, compact cognitive model of human reasoning in programming wasn't getting us anywhere. What we really need is just a way to talk about how notations convey information. And he came up with this thing called the cognitive dimensions. They're a way of laying out and talking about the tradeoffs you make in choosing an information notation, and how that notation addresses the different problems in using it to describe something. Now, one of the famous papers about using the cognitive dimensions comes from a person I've never met, but I've read his paper many times: Steven Clarke, of a company I think you've heard of called Microsoft. He used the cognitive dimensions to analyze C# back in the early days, when C# was I think still being developed. So it's been used with C#; it's been used for the design of remote controls for televisions. I mean, information notations can be just about anything. Now, I've got a lot of material here that I won't go through, but let me just mention one of the cognitive dimensions -- probably my favorite one, and I think it says a lot about information notations. This is the idea of the viscosity of an information notation. Applied here: how easy is it to introduce small changes to an existing parallel program? So look at OpenMP. I have this loop that I'm parallelizing -- and don't worry if you don't know OpenMP -- I'm just saying do this loop in parallel and do a reduction into sum, and you'd better make the variable x private or you're going to have a race condition. Now, the schedule clause is a semantically neutral clause -- except for the possibility of introducing races, it's mostly semantically neutral -- so if I want to change the schedule, I can do it just by changing the schedule clause. So it's a very low viscosity notation. Whereas if I did the same thing -- I'll get to you in a second -- whereas if I did the same thing in the Win32 API, an explicit threading API -- and I'm not picking on Windows -- now the schedule is something I'm hard-wiring in there. If I want to change from a static to a dynamic schedule, I have to make all sorts of changes. So this is a more viscous notation. Now, notice very closely, I didn't say one was bad and one was good. There are things I can do with the high viscosity notation that I can't do with the low viscosity notation. But it now gives me a way to talk about them, and talk about the tradeoffs between these two styles, and to have a productive dialogue about that. So now -- >>: What is one -- you call it semantically neutral. >> Tim Mattson: Yeah, I know. It's relatively semantically neutral, yes. >>: Does it matter what data [inaudible]. >> Tim Mattson: Yeah, then you put a reduction in there and then you have the different orders in how you do the sums.
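For reference, here is a minimal sketch of the kind of OpenMP loop being described; the numerical-integration body is an assumed stand-in, since the actual slide code isn't reproduced in the transcript:

    #include <omp.h>

    double integrate(int n)
    {
        double sum = 0.0, x;
        double step = 1.0 / (double)n;
        int i;
        /* x must be private or the threads race on it; the reduction
           clause gives each thread a private copy of sum and combines
           the partial sums at the end of the loop. */
        #pragma omp parallel for private(x) schedule(static) reduction(+:sum)
        for (i = 0; i < n; i++) {
            x = (i + 0.5) * step;
            sum += 4.0 / (1.0 + x * x);
        }
        return step * sum;
    }

Changing schedule(static) to, say, schedule(dynamic) is a one-clause edit, which is the low viscosity being claimed. In an explicit-threads version of the same loop, the static partitioning is hand-coded into each thread function, so the same change ripples through the decomposition logic.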
>>: I think the important piece is to understand -- to make a distinction between thinking about the abstractions being used and baking them into the language using some [inaudible]. So if I look at the success of [inaudible] patterns, the good thing about them is none of them require you to change your language; you can just use the [inaudible] languages that are already there and you can implement them all. >> Tim Mattson: Right. >>: No new syntax, nothing needed like that. So I think what happens in the parallel world is people haven't quite advanced to that level; they are still revising the language itself. I mean, we should be talking about abstractions that can be done on top of some base foundation that doesn't need to change every time we think of a new -- >> Tim Mattson: So I'm going to agree with you, though maybe not fully. I think my pattern language is an attempt to write down that set of abstractions. And as we iterate through it, once we come up with the right pattern language -- where we get enough input from a broad enough array of people -- that will give us the set of abstractions, and then we can start to talk about what the language is. Now, these cognitive dimensions are all about syntax. So it's almost like a complete, radical change of subject, because now I'm not talking about abstractions anymore, not much; I'm really talking about syntax. But I can't tell you enough: to a boring application programmer like me, syntax matters. I've had computer scientists tell me time and time again, syntax doesn't matter, just get the semantics right. Screw it. No. Syntax matters. See this Apple on this computer? It's there because you changed the APIs in Visual Studio and broke all my programs. I've never gone back to Visual Studio since. Syntax matters. I'm still pissed about that. So these cognitive dimensions are all about describing syntax. And because I know you're way past your threshold, I want to get to this nice picture. There's a whole range of these cognitive dimensions, and you can look them up -- read the Steven Clarke paper. If you Google cognitive -- I'm sorry, you don't use Google here at Microsoft, you use -- what's the Microsoft one? Live Search. Okay, Live Search on cognitive dimensions, and you'll find there's a whole list of them. And once again, it lets you -- I'm going to show you this one here -- it lets you productively compare across programming languages. And I can say that, yes, OpenMP has a very low viscosity and MPI has a very high viscosity. I'm not saying good or bad, I'm just comparing them. Okay? And I can go on: OpenMP, because of that stupid shared address space, is very error prone -- if you've written significant OpenMP codes you've written some ugly race conditions, and hopefully you have found them. Whereas MPI is actually not that error prone. That's why we like it: it's pretty easy to get my program correct, though I may have to do an awful lot of work to decompose the data structures. The point is, this gets us past one is good, the other is bad. It gets us to a point where we're talking about the features of a language that make it good and the features that make it bad. So the last thing I want to say -- and then I'm going to shut up -- is we have metrics of performance. Why don't we have metrics of programmability? We need standard programmability benchmarks that we can use and argue about. And my poster child for this is the [inaudible] problems.
I think some of you here have heard of them. >>: [inaudible]. >> Tim Mattson: Yeah. I love this. But it's old and dusty, and I think we need to update it and expand it and get a broader community involved. And what I'd like to happen is this: the Berkeley folks did a really good job of coming up with these 13 dwarves, so I would like to get a group of people together to define the 13 exemplars, so that we come up with a programmability benchmark for each one of these. But in my mind, a programmability benchmark, to be effective, has to be defined paper-and-pencil, like the [inaudible] problems, and there has to be a reference implementation that is short and easy to read, because this is the key: if I have to read a 10,000-line program to compare one programming notation to another, I can't even start. But if I can compare a 200- to 500-line program -- and that means we're moving into the danger of the toy-code area -- I think we have to be able to make these standard comparisons between programming notations, or we'll get nowhere. So that's all I had. Thank you. Just closing, what I really hope happens is that we can start working together. Think about it: we have this structure at Berkeley; it's a great place to come to work together where the lawyers won't bug us. I want us to work together on a pattern language, because we need as broad a community as possible to make it work. I think we'd all benefit if we built a discussion framework for programmability; the cognitive dimensions are a great starting point. And then finally, I really would love to see us come together and work on these standard programmability benchmarks. I think if we don't do this, chances are we'll just add to that list of programming languages and we won't necessarily create anything that anyone will use. So thank you very much. [applause]. >>: Any other questions for Tim? We're going to lunch after this, just in the cafeteria. And what's the afternoon special? >>: We have a [inaudible] room 2209. And we [inaudible] about patterns. >>: And that's from like one to two or something. >>: [inaudible]. >>: Well, I'll ask one question, because I was going to ask you a question on the metrics [inaudible], and the slide you did show was a little different than what I was thinking as far as metrics. So as you distill down, right, and you take a higher-level concept into a lower-level layer, how do you judge whether or not your choice at that lower level is a good one? I mean, how do you calculate sort of the error, the inefficiency, how much you've lost or how much more you could get, as you break things down? Because as you go to the lower level there are [inaudible] choices of how you implement the higher-level things. So how do you decide whether you made the right choice or the wrong choice? >> Tim Mattson: Right. So how do you decide, as you move through the layers of the pattern language, that you've made a right choice or a wrong choice. Yeah. There are two things to point out in your question that are important to appreciate. One is that there's no single path. And that's the issue: how do you know you have the right path? I submit that it's iterative.
You go down all the way through to the lower layer -- now you have a stack of paper with boxes and diagrams and little notations -- and you get to the bottom and you go, gee, is there a better way to do it? And you have to iterate back through. It's not objective; it's not as if you can assign a numerical weight to each branch along a tree and say, ah, this is the lowest-cost path. We're talking about the human process. It's fundamentally iterative. >>: So you have no metrics at the higher levels to judge the branching until you get to the bottom level, and then you just have performance and scalability? >> Tim Mattson: Yeah, you get to the bottom -- well, usually -- we're talking about software design, so hopefully you get there before you have code running and performance metrics. The whole idea is that you go through a design cycle asking: is it exposing enough concurrency? And if you look at my book, we talk a great deal about that iterative nature: how much concurrency are you exposing, and is it reasonable relative to the target platforms? We like to talk about being architecturally neutral, but fundamentally at some point you have to break that architectural neutrality and think, gee, I'm going to be on a Windows platform, on a laptop with a 16-core processor. But it is iterative. And it's fascinating to me -- if you do look at the psychology of programming, that iterative nature comes out in study after study after study. We teach people in school a top-down method of software architecture, but the fact is that no good professional programmer programs that way. The really good, productive ones always iterate down a little bit, then jump back up, then iterate down again and jump back up. And therefore a notation and a framework that supports them has to support that iteration. >>: All right. [applause]