>> Yan Xu: So the first session will be on transient computing and our
keynote speech will be from Dennis Gannon from Microsoft Research on
Cloud Computing and Scientific Data Analytics Challenges. Thank you.
>> Dennis Gannon: Okay. Can you hear me all right? Does this thing
work? Okay. So actually I changed the title after I gave it. I was
going to talk about Cloud Computing and the Long Tail of Science. And
the theme of this relates to some extent, or I hope to a larger extent,
to the discussion that we just had in the last hour. So, hey, this
works. Okay.
So I'm going to talk very quickly about what the challenges of long
tail science are, what long tail science is in our view. But what I
really want to get at in this talk is this question: Is there a sustainable
financial model for scientific data? And that, I think, relates very
closely to the discussion we had. And I'll talk about our data centers
that are powering commercial clouds; we've got a lot of them. And I'll
talk about what we've been doing with the Azure cloud in research. Data
architectures and the role of MapReduce: I deleted those slides because
I thought this was an hour talk and actually it's a half-hour, so we'll
skip that. And mostly you probably know that anyway. And talk a little
bit about managing analytics from your desktop and building communities
which brings me back to the main question.
So I think you may have heard of this notion of the Fourth Paradigm and
the revolution of science, so I don't have to spend too much time on
this discussion. But as you may have noticed, the data explosion is
transforming science, not just astronomy but every science out there.
Everybody now is a data scientist. This is something that has happened
very quickly and it has had a profound impact on many disciplines. And
one of the things that all disciplines have told us that they need is
the technology for publishing data and sharing data in the cloud, to be
able to do analytics especially on a much more massive scale than they
have in the past. Another somewhat pejorative name for the scientists
in what we call the long tail are the spreadsheet scientists. Right?
Spreadsheets are the predominant analytical tool used across
disciplines in science. The astronomy community is well beyond that but
I'm sure some of you still do some spreadsheet work now and again. So
they need to move beyond that. They've got a desktop machine and a
laptop, and it's got only so much storage and so much computing capability.
And it no longer fits with what they need to do. And there are a number
of other issues that have come up that have changed the landscape for
how they have to deal with science, so I'll talk about the sustainable
economic model.
So what do I mean by long tail? So what I've put in the graph here
is the size of data collections. And high energy physics is clearly up
there. And I put astronomy there too, but I've had astronomers argue
with me that they do not consider themselves yet in the same data
league as high energy physics. Well, you folks could tell me whether
that's right or wrong. Genomics is now, as we've seen, really working
itself there. But what I think about is that collective long tail of
science that consists of economic data, social science data, behavioral
science data, political science data. A variety of disciplines that you
see around your universities that you never in the past have seen
really very much of in the computing center. And one of the things that
changed things is, as you know, the National Science Foundation now
basically requires all the data that goes into publications to be made
public. And the universities are currently struggling with this right
now. This data has to be preserved. The data must be sharable and
searchable and analyzable. Now you talk to some scientists and you say,
you ask them, "Well, how is your data --" Well, first of all this is
largely, from the NSF point of view, an unfunded mandate, this idea of...
>> : [Inaudible] mandate. I'm sorry, you can't get away with saying
that. We do not require all data be made public. We can debate that at
length, but we do not require all data to be made public.
>> Dennis Gannon: All right. So it's not required to be made public,
although people still have to come up with a data management plan, and
the fact is that in the UK, I believe, much of the data is now required
to be made public. I was just in Brussels last week talking
to folks there and so it is happening. At some point we have to deal
with it. And it is an unfunded mandate to deal with it at this...
>> : No, it isn't.
>> Dennis Gannon: Well, I -- Talk to the researchers that I talk to
[inaudible]...
>> : Yes, but [inaudible].
>> Dennis Gannon: Well, okay. Perhaps they don't understand.
>> : It's a debate topic.
>> Dennis Gannon: All right. Well, we could debate it. But how many
people out here among the astronomy community feel that the data they
have is -- there's a pressure to make more of it public from the
government? Does anybody sense that?
>> : Well, certainly...
>> : Pressure is not a requirement.
>> Dennis Gannon: I see.
>> : There is pressure.
>> : Well [inaudible]...
>> : There are certain specific requirements in specific fields. There
is not [inaudible]...
>> : Right. Okay.
>> : But those people who get funding from...
>> : I'm sorry, guys, but I was involved in writing this policy, and it
irritates me that you keep misunderstanding it.
>> Dennis Gannon: Well, okay, so you don't -- There may be...
>> : Let me [inaudible].
>> Dennis Gannon: ...good reason to make data public that is not a
mandate. We'll leave it at that.
>> : If you want more funding, [inaudible].
>> Dennis Gannon: I'm sorry?
>> : If you want more funding in the future, making your data public is
certainly in your interest.
>> Dennis Gannon: And there are also good scientific reasons for doing
that as we've discussed before. But then there's this issue of
sustainability, you know. How can we create an economic model for this
long tail of science? First of all the government will not directly
support an exponentially growing data collection. I talk to some
scientists and I say, "Well, you know, you want to make your data
available for everybody for a long period of time, how are you going to
do that? Who's going to pay for it?" And they all say, "Well, my data
is national treasure. It'll be kept forever and the government will
support it." Well, that's sort of like declaring yourself to be a
national park, and that just doesn't quite work. So what we'd like to
do, and this is a hypothesis we have that we've worked on, is that we can
create, it's possible to create, an ecosystem that supports a
Microsoft -- A Microsoft? How about that? -- a marketplace of research
tools and domain expertise. Right?
The idea is that we will build critical mass data collections in the
cloud. And this relates to the discussion we just had, that there are a
lot of scientific tasks that others can do better. That was the point
that was made earlier, this notion of making research public, which I
thought was really intriguing. But there are certain tasks in a
research activity that perhaps there's someone out there who is much
better at doing that, and that person or that group can provide the
expert services to the community to do that. So if we had data
collections that were stored someplace that had to be paid for and then,
in addition, have expert services available that would simplify the
research life of the people that need those services.
For example, I remember talking to someone doing computational fluid
dynamics around certain solid bodies, and one of the biggest challenges
in doing that is building a really high quality mesh of that body. And
that is a real important skill to be able to do that. Well, there are
people that are extremely good at it and there are people that have
even set up services that will do that for other folks. And actually
NSF funded that to get that going. So the idea is that if you provide
something that is high enough quality that people will be willing to
pay a small subscription fee -- You'd get the basic data for free if
you wanted it because it's there and it's a community collection. But
if you want to be able to access services at a higher level, something
that is truly useful, something that accelerates your research -- because
why not let a professional do it, or let a group that is particularly
good at that carry out that part of your research for you -- then that
would also give you a mechanism: since it's almost contractual, or through
some arrangement, they would agree to cite you in their publications.
That issue came up earlier. And so I believe that this is a way that we
could build a subscription-based service for a lot of scientific data
analysis.
So I think Michigan's Inter-university Consortium for Political and
Social Research, the ICPSR, is a very good model. They've been doing
this for ten years, providing the social science data community with
highly curated, very well developed data collections as well as
analysis services. And they've been doing that for at least a decade,
I'm not sure how long. And that is completely supported by
subscriptions. Most universities, big universities I believe, subscribe
to this for any of their researchers to use. So this is a really
interesting model. Why can't we do that?
Anybody familiar with the service that they provide? If you go on their --
Yeah, if you go on their website, you'll be impressed by the amount of
stuff that they offer. And for the researcher, since the university is
subscribing to it, it feels like it's free. So our hypothesis is that we
can do this, as a community we can do this.
And so we're going to try to test that hypothesis. So now I'm going to
get away from that for a minute and just talk about the data centers
and our cloud platform. We've been building data centers for a while at
Microsoft. Microsoft and Google and Amazon have really a massive
collection of really very large data centers. These data centers are
spread. We've got two in Europe. We've got two big ones for our Windows
Azure cloud platform, two in Europe, two in the U.S., two in Asia and
more to come. And in addition to the Azure, we've got a bunch of other
data centers. And these things range -- Our small data centers are
maybe a hundred thousand servers to a million in the big ones. So these
are really big facilities. Whoops, there's one more. And now here's one
that is -- To give you an idea of where the technology is going, this
one -- Let me back up here -- in the upper right corner we call a
fourth generation data center. And the interesting thing about this is
this data center, and this is an artist sketch obviously, there is no
building to it. Okay? We've gotten so good at building these data centers
and putting them in big nasty buildings the size of mini football
fields that we've realized that, "Well, you know, actually the way we
build the containers for the servers we don't need the building
anymore." We simply pull these massive containers up, plug them into a
backbone of power and networking, and now you've got a data center.
To give you another idea, this is based on these data centers, they
cost half a billion dollars to build. But they're based now on these
model of shipping containers. Each shipping container -- Or something
the size of shipping containers. We've gotten away from shipping
containers. But they're the size of a shipping container, forty feet
long and they have as many as 2,500 servers each in one container. So
it's completely packaged. It's got the network in it. It's got a power
outlet. There's basically three plugs on one of these things; there is
a power plug, a network plug and maybe a cooling plug which you pump
water in. Now the water line in the newest ones is literally the size of a
garden hose. It's not a big fancy powerful water system because these
things here, like this guy -- This is the latest generation -- primarily it is cooled with ambient air, and that's why it can be left
outdoors even in places where it gets extremely hot.
If it gets really hot, we turn on the garden hose a little bit. It
helps cool it down. So these things are driving down the cost of energy
consumption. And the simplicity of setting up the data center, that,
you know, you don't even go into this thing very often. If servers
fail, you don't take it down to repair it. You wait until a whole bunch
of them fail and then you can either take it offline and go in to try
to repair it or you just ship in a new container full of servers, send
that one back to the factory. So this is the technology that's used
right now to build the large data centers.
Now the properties then of these data centers: they are really
providing, you know, information services to many users simultaneously.
The deployment of services is automatically managed on virtual
machines. How many of you are familiar with -- Most scientists are
familiar with Amazon, I'm sure. Amazon computes their -- Yeah, okay, so
people know about deploying things with virtual machines. And it's the
same thing in the Windows case except we have a higher level of
services, but now we also have something that's identical to Amazon's.
These things provide automatic fault recovery for failed resources. The
data replication is already built into this. On our cloud if you store,
you know, a gigabyte of data in the server that gigabyte is replicated
at least three times and maybe as many as seven times. And it can be
geographically replicated or even across international borders, but
that's your choice. Yeah?
>> : So replication is for...
>> Dennis Gannon: Reliability.
>> : Okay, so that implies backup or --?
>> Dennis Gannon: Basically this is the backup. Your data is replicated
so that if the server fails there is at least two more copies of that
data and a third copy will be automatically brought up based on the
other two.
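As a rough illustration of the replication scheme just described, keeping at least three copies of each piece of data and rebuilding a lost copy from the survivors, here is a minimal Python sketch. It is only an illustration of the idea, not Azure's actual storage service; the server names, the replica count constant, and the placement policy are all assumptions.

```python
# Minimal sketch of "keep three copies, rebuild a lost copy from survivors".
# Not Azure's implementation; server names and policy are hypothetical.
import random

REPLICA_COUNT = 3

class ReplicatedBlob:
    def __init__(self, data, servers):
        self.data = data
        # Place the blob on three distinct servers.
        self.replicas = set(random.sample(servers, REPLICA_COUNT))

    def on_server_failure(self, failed, servers):
        """If a replica is lost, copy the blob onto a healthy server."""
        if failed in self.replicas:
            self.replicas.discard(failed)
            healthy = [s for s in servers
                       if s != failed and s not in self.replicas]
            self.replicas.add(random.choice(healthy))
        assert len(self.replicas) == REPLICA_COUNT

servers = [f"server-{i}" for i in range(10)]
blob = ReplicatedBlob(b"1 GB of science data", servers)
blob.on_server_failure("server-3", servers)  # a third copy is brought up
```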
>> : There was an instance of Amazon losing data.
>> Dennis Gannon: Well, I'm talking about Microsoft.
[ Audience and speaker laughing ]
>> Dennis Gannon: Yeah, actually that was an interesting case because,
in that case, when Amazon lost services it was really the customer had
failed to select geo-replication because their whole data center went
out. And so the customers that had geo-replicated their data weren't
even offline, you know, their services kept running. The customers that
were in trouble were the ones that said, "Oh, yeah it's in the data
center. It'll never go down." You know, power is lost in whole regions.
You know? Tornado wipes out something.
>> : But the question is how much more expensive is geo-replication?
>> Dennis Gannon: For the user it's no cost, the replication, it's just
a choice. Some people don't like geo especially across international
borders. That gets to be a sensitive issue. And also, those of you
understand about parallel computing, this is designed to support two
levels of parallelism in this architecture. One level is giving
service, say Netflix or something with thousands of concurrent users,
and then within that you have -- Or a better example is Bing or -- What's that other one? -- Google, you know, it is something that is -- You have a lot of concurrency going on in the construction of a search,
the reply to a search, lots of parallelism there, and then parallelism
across the user base, many, many concurrent users. So it's really -- Multiply those together and you see these things are designed to
support truly massive parallelism.
Now, though, it is not the same as a supercomputer; this is an
important point. The scale of the data centers is really quite large
but the thing that's different about the data center and most
supercomputers lies in the network. Those of you who are familiar with
supercomputers -- And this is an old slide because it mentions the old
Blue Waters design with forty thousand 8-core servers and Road Runner,
thirteen-thousand cell processors and our Chicago data center is a
hundred thousand 8-core servers at the time I made this slide. It's
bigger now. So they are very big, but the network architecture is
different. In a supercomputer you spend an enormous amount of money on
the interconnection network between the processors, so for your computation,
if you're doing parallel computing, you have really good what's called
bisection bandwidth, the data that travels between processors can move
at very, very high speeds and with very high bandwidth.
In a standard data center, the network is completely different. It's
designed to be outward focused so that thousands of concurrent users
are coming in and going out. And they're coming in over the Internet.
And so the network inside the data center is like an internet. It is
based on internet protocols, standard IP protocols. And that means,
those of you who are familiar with internet protocols, they're very
complicated, multi-level routing and exchanges. It is not the same as
you would see in a supercomputer which is very, very lean communication
protocol stack.
Now this is also changing. Amazon was the first to recognize, you know,
"Some people want to do some supercomputing-like tasks inside the data
center." So they put up a really nice sort of a supercomputing cluster
within the data center, and Microsoft has just announced that we're
going to do the same thing, and so there'll be such a cluster in Europe,
another one in the U.S., another one in China. And so that'll give
people who need, really need, massively parallel supercomputer-style
stuff, things using MPI that type of communication with low latency,
yeah, we'll have that available. And, of course, you'll pay more for it
because it costs more.
So what has been my experience working with scientists in the cloud so
far? And what I've been doing for the last two and a half years is
running a program which we work with various funding agencies, the NSF,
the European commission, the National Institute of Informatics in
Japan, the agencies in Asia, in China and Taiwan, about making our
cloud available to researchers so that the researchers could try
experiments using the cloud.
And it's been interesting. So far I've got about 90 projects scattered
around the world that are up and running on the cloud. Of those 90
about, I would guess, a third of those so about 30 have really done
something very, very interesting with it. To give you some ideas of the
types of things people have been doing: Like, for example, we did a big
project with the University of Washington on protein folding where they
had a tool that was based on a technology like [inaudible] which they
allowed people to use, you know, volunteer computing to do protein
folding. And they had a specific problem in that there was a pretty long
wait time to get anything done. And so we had someone from the Baker
Lab, who set up such a system, come to us with a specific challenge
that they were working on having to do with something involving, I
believe it was a Salmonella -- yeah, a Salmonella virus injecting DNA -- Yeah, it was a really important protein folding problem having to do with
the study of Salmonella. And so we got them two thousand concurrent
cores, put their system up on it so our cloud became the volunteer, and
they were able to handle that computation and get a couple of
publications, a Nature and a Science paper, out of that in about a week
or so. So that was quite successful.
We've got another project in France with INRIA there where they're
using a thousand cores to compare FMRI scans of brains together with
the associated genetic information from the patient to try to
understand how the brain anomalies might be exhibited within the
genomic information as particular types of mutations or faults in the
genome that would be associated with these brain anomalies. And that
project is still going on.
Fire Risk in Greece, another one from Europe. Greece, a couple of years
ago, had some really horrible fires. And they built a service on Azure
to be able to take data from around the country to stream it in, to do
the analysis and the prediction and the modeling of where the fires
might break out. And this then became -- Because it's on the cloud, it
became a web service that could be used from a laptop or an iPad or
something that could be in the field. You know, the first responders
could actually be querying this and looking where the fire hazards are
and what the situation is.
>> : Does this include SQL Azure as well?
>> Dennis Gannon: Pardon?
>> : Does this include...
>> Dennis Gannon: Ah, yes. Some of these things are SQL Azure related.
These three cases here did not use SQL Azure. But SQL Azure, for those
of you who don't know: in addition to the Azure cloud and the
mass data storage that we have, there's also a mass deployment of SQL
servers. And that is something that is used by some of the projects but
not all. In fact that didn't become available until many of these
projects already got started. Drug Discovery: Newcastle in the UK is
using Azure to model properties of various molecules for Drug
Discovery, and that project has been quite successful and it's even got
a commercial spin off going. Paul Watson has been doing that.
In Japan we had somebody looking at structural analysis of predicate-argument structure in the Japanese language. So they looked at a whole host of text,
and they do a lot of machine learning-based techniques again on Azure.
They used ten thousand cores on Azure to do this analysis. And finally
another example is one done in the U.S. with the University of
Virginia and also South Carolina, where they are looking at large scale
watershed modeling. And that project is also still ongoing and quite
interesting.
Now, what have we learned from this? You know, traditional
communication-intensive MPI applications belong on supercomputers. We
had a few people try to do traditional, say, quantum chemistry modeling
that they were doing on a supercomputer -- This is an Australian group
-- and do it on our cloud, and they said, "This was horrible." And I
said, "Well, I told you it would be, but you wanted to try. Thanks for
trying." So aside from that the cloud had some really wonderful
advantages. First of all it is an environment that encourages sharing
and through-web access. Excuse me. I'll make it through the half-hour.
A lot of this is massive Map Reduce data analytics on cloud resident
data, and that works very well. Or massive ensemble computations, you
have a thousand things that you're trying to run simultaneously. A
thousand cases, okay, that's a typical ensemble calculation. And you
can do them all in parallel and that doesn't require lots of
communication. It works very well. Also this model of scale-out as
needed works well. The users found that they like this pay-as-you-go
billing model over buying a cluster and then having to devote the
life of a graduate student or two to maintaining it. The pay-as-you-go
model works pretty well. In particular, say you buy a cluster with a
thousand cores. You then want to somehow use something with ten
thousand cores; well, your thousand-core cluster doesn't expand.
Whereas the cloud can scale out if you need it to. And this turned
out -- You know, I did a detailed survey of the users that used it and
of all the successful projects they said, "Yes, we would go back and
ask our funding agency for money to use the cloud rather than go out
and buy specialized hardware."
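As a minimal sketch of the "massive ensemble" pattern just described, here is the shape of the code: a thousand independent cases, no communication between them, so each one can run wherever cores happen to be available. The function name and the parameter values are hypothetical placeholders, not any project's actual workload.

```python
# A thousand independent ensemble members, run in parallel; on a cloud
# deployment the local pool would be replaced by however many worker VMs
# you have provisioned. run_case() is a stand-in for one simulation.
from concurrent.futures import ProcessPoolExecutor

def run_case(params):
    # Placeholder for one ensemble member (one parameter setting).
    return sum(p * p for p in params)

if __name__ == "__main__":
    ensemble = [(i, i + 1, i + 2) for i in range(1000)]  # 1,000 cases
    with ProcessPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(run_case, ensemble))
    print(len(results), "cases completed")
```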
And I see I'm almost out of time. Should I go to one thirty? Is that
the time?
>> : Yeah. A few minutes.
>> Dennis Gannon: Few minutes. Okay, I'm almost done. But what do
people do here? They want to bring large scale data analytics to more
people, they want scientists to be scientists. Okay? You know, most
scientists don't want to deal with system administration. They don't
want to learn to operate a supercomputer; they want to focus on their
science. They use standard tools like spreadsheets and statistical
packages, desktop visualization. What they would like is to be able to
push a button and have the big hard parts go off into the cloud, and
they want to be able to share results with their collaborators. So what
we see as a software design stack is to design a collection of data
management and analysis tools that is open, extensible and provides, you know,
this economic sustainability model, is accessible from a desktop,
encourages collaboration and leverages this capability of these public
clouds that we've got. So we're working on trying to figure out whether we can
build such a thing to do that. That's an interesting -- Oh, let's skip
this.
Oh, yeah, so here's an example of the types of technologies. We're
really interested in not basically selling you our latest piece of
software but finding out what are the tools the scientific community
wants to use. And we'll put them there on the cloud. So here's an
example. We now have some really great support for Python, and I know
that's widely used in the scientific community, as is R. And so there's
this thing that has been built by a couple of groups in the Python
community called Python Notebook. So this is sort of a web thing that
looks kind of like a Mathematica Notebook which has got all sorts of
cool things. It's a way of sort of communicating. Here's my new way of
doing some computation, and here it is. It's an executable notebook.
The notebook is actually running in the cloud, and it can invoke cloud
resources. It can even do the back end parallel computation.
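To make the notebook idea concrete, the following is the kind of executable cell one might share; the text, the code and the resulting plot travel together. This is a generic sketch assuming only numpy and matplotlib, not a specific notebook from the project, and the heavy computation could equally be dispatched to cloud resources as described above.

```python
# A typical shareable notebook cell: compute something and plot it inline.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 4 * np.pi, 500)
plt.plot(x, np.sin(x) * np.exp(-x / 10), label="damped sine")
plt.legend()
plt.show()
```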
Another thing that we've worked on is something called Excel DataScope.
And what it is, is an extension to your spreadsheet that allows you to,
from your spreadsheet, you get this ribbon that you see in Excel. Well,
we have a special ribbon for scientific data analysis for importing and
exporting data, doing things like outlier detection, machine learning, a
variety of different algorithms. And we've released some of this open
source and we're going to be releasing more of it later.
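Excel DataScope itself is an Excel ribbon, but the kind of outlier detection it exposes can be sketched in a few lines of Python: flag values more than three standard deviations from the mean. The threshold and the synthetic data are assumptions for illustration, not DataScope's actual algorithm.

```python
# Simple z-score outlier detection, the idea behind a "find outliers" button.
import numpy as np

def flag_outliers(values, z_threshold=3.0):
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return np.abs(z) > z_threshold

data = np.concatenate([np.random.normal(10, 1, 1000), [25.0, -7.0]])
print(data[flag_outliers(data)])   # prints the two injected outliers
```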
So next steps? We've been working very closely with the Internet2
community. And we got started lately with this notion of a large scale
project with -- Thirteen university CIO's came to us and they said,
"Help us. We've got this challenge now. Our people in the past we just
gave them a workstation and they were fine. Now they need data
collections and they want to be able to store the data, share the data,
they want to be able to mix data from different sources." And they came
to us. The CIO's said, "What can we do?" And so we had a group of us
that met in March, and we started up something. We made an agreement
with Internet2 to be a provider of cloud resources -- or a broker, I
should say, of cloud resources to the university communities. And so
we've sort of focused on two different things. One is -- I should back
up again -- genomics research. We will do a program around genomics
research. And you've heard some of this stuff from Microsoft and the
great interest in that here, and we'll continue that. But in terms of
the long tail, a workshop we've got coming up on October 15 and 16 at
the University of Washington on this sort of cyber infrastructure for
social science. And so there we're going to be looking at trying to
find out can we build a community that is interested in sharing data,
in bringing together important data collections and the important tools
that people need to study the data. And so, you know, our goal is to
try to demonstrate that we can build two sustainable collections, or
collections for two communities, within three years, and at the end of
three years to see if we've got something that has enough services
and capability that people would be willing to pay for it to be self-sustaining.
So if you're interested in this, talk to me. And I'll stop there.
[ Audience applause ]
>> Yan Xu: Thank you very much. And we have time for a couple of
questions, please.
>> Dennis Gannon: Yeah?
>> : So on the slide that you had about the lessons.
>> Dennis Gannon: Yeah?
>> : So I don't think that any of those seemed particularly surprising
necessarily. I was wondering, is there anything that actually struck you -- that you didn't know at the time -- that you learned from this?
>> Dennis Gannon: Oops. Well, I just killed that. Well, I won't go back
to it. Well, you know, in hindsight, yes, those things are not very
surprising. It wasn't clear to me, though, you know, how many users
would -- They're willing to experiment with this cloud as long as I've
given it to them -- how many users would, say, afterwards when it's all
done say, you know, "Yeah, I would actually ask my funding agency to
provide me with, instead of money to buy a cluster, provide me time on
Amazon or Azure." And that wasn't in the slide. It was in my more
recent analysis that I've done. And that was quite strong, both in the
U.S. and in Europe.
>> : Your average astronomer, I think, is only just now coming to terms
with data access through the cloud [inaudible]. And I think most
astronomers really don't want to do any analysis in the cloud. Many of
the online tools and [inaudible] very similar, to produce results for
their publications they do that on the computer with code that they've
written themselves or collaborated. And so I like the Python Notebook
idea but can you imagine other ways to bring the traditional laptop
environment where astronomers are comfortable with their own codes into
the cloud? Is that a possibility?
>> Dennis Gannon: I think so. I mean, it depends upon the -- You're
talking about taking the applications that people run on their laptop,
pushing them to the cloud and scaling them out. Yeah, absolutely....
>> : [Inaudible] applications up there rather than providing them with
a fixed set of applications....
>> Dennis Gannon: Right. So I absolutely agree that you need that
capability. We have this thing that we built through the European
project that's called the Generic Worker. And it takes some application
from your desktop and through a control panel you can push it out and
have one or more instances of this thing running in the cloud. Now it
becomes very complicated depending upon if that application has a
special user interface, you know, graphical interface or whatever. That
can be dealt with, but it typically has to be done for each one. But, yeah, I think
that's important. That's one of the reasons I am really pleased that we
are now able to run Linux VM's on the Azure Cloud.
>> : It's kind of a comment on the question, but the [inaudible] Camp
Fire, that's what Campfire does. You go on a VM and it's like being on
your own desktop. And it's a cloud. So that's [inaudible].
>> Dennis Gannon: I should go look at that later. Yeah.
>> : I mean this is [inaudible] need to get over this idea that they
can't use other people's tools. [Inaudible].
>> Yan Xu: Yeah. Okay, briefly, last [inaudible].
>> : So sort of following out of this last question, what about the
case where you've got software that requires licenses? Specifically
MATLAB? And, you know, if I'm running a big problem with, you know,
your ten thousand cores, and don't tell me to use R or Python, okay.
>> Dennis Gannon: You want MATLAB.
>> : I've gone down that path. And so the question is, well, what do I
do? [Inaudible]...
>> Dennis Gannon: Yeah, so actually in the case of MATLAB we've had
discussions for several years now with MathWorks. And they are, I would
say, warming up quite nicely to the idea of providing -- I mean this is
a business decision on their part. And I think they're kind of getting
it now. I don't know what will come of that. I hope it is something
that happens sooner rather than later. But I'm not involved in that
discussion. You and I can talk about that but, yeah. Yeah....
>> : We should talk. Yeah, because I've got a problem I could, you
know, run tomorrow if that...
>> Dennis Gannon: It has been a discussion. I mean the first time...
>> : Okay.
>> Dennis Gannon: ...we went to MathWorks about this they said, "No
way. Get away from us, you devil." And then over the years they've sort
of kind of figured it out. But I don't know how far along it is.
>> Yan Xu: Okay. Thank you very much, again.
[ Audience applause ]
>> Yan Xu: So we move to the second keynote of this session. And Mark
Stalzer from Caltech will tell us about Trends in Scientific Discovery
Engines.
>> Mark Stalzer: All right. Thank you. Can you hear me? Okay, super. So
I'm actually going to talk about one of Dennis' slides but in much more
detail. And that's supercomputers. And so the reason that
supercomputing is important for two: there are some applications that
you can only do on a tightly coupled machine. All right? The other
thing is, is that supercomputers act as kind of the Formula One race
cars of the computing industry and they drive the progress of the
entire industry.
So to those of you who saw parts of this talk last year, I've made
changes. So I know it's after lunch but, you know, please stay with me
here. I've already done that. And so some of the things about
computing, and these have been true for quite some time, is that
supercomputers are always built out of commercial parts. And this was true
even with the Cray-1; the commercial parts they used were just very simple
gates. And some of the drivers -- It used to be actually that the reason we
had a semiconductor industry in the United States early on was for
missile guidance, and now I think it's all for virtual missile guidance,
which is video games. But it's all about power and packaging. When
you're going to try and get the most performance out of a computer, you
have to be very careful about your power and very careful about how you
package it. So this is a definite -- Well, we see this in cloud
computing too, but they have different constraints they're trying to
optimize. And the fact is, a hundred megawatts is very expensive.
All right?
But there are devices that are extremely power efficient like the chips
that drive the iPhone or the iPad. And these computers are hard to
program but they can be easy to use if you have the right abstractions.
And so this talk is kind of broken into two pieces: one, we'll talk
about the pure high performance computing in terms of how quickly you
can do linear algebra, and there's some very interesting things going
on there. And then I'm going to try to make the case that we're all
messed up on our storage architectures and we need to rethink those.
And the way to do this is to come up with a metric -- like the
LINPACK benchmark -- for storage.
Okay, so trends in what I call the simulation engines or computing
without data because they compute much faster than they can pull data
in. And so the top ten supercomputers: This is just the June list. This
year it is Sequoia. And it's running at 16 petaflops, which is
quite remarkable. And there's another nice machine in Japan that's
running at 10 petaflops. But you'll note that these machines draw
nearly 10 megawatts of power when they're running. The other
interesting thing is these are not accelerated. You have to go down to
-- Oh, boy -- This machine is actually accelerated. So people say that,
you know, GPU's are all the future, yet the fastest computers we have
right now are actually general purpose. This is what Sequoia looks like
and you can actually play soccer in this machine room. It's big enough
to do that -- This is at Lawrence Livermore -- if there weren't
computers in it. And it's very cold. And so one of my points will be
that all the parallelism, or a lot of it, is moving onto the socket. So
people ask, okay, is it a processor or is it a core, anything like
that? The thing to think about is that the socket is where a lot of
the computing is happening. So there's the socket.
And of course these things are optimized for dot products. And that's
what one looks like in C++ and there's a lot going on here, about ten
instructions. And what's interesting about this point in time is that
we can compare the fastest machine in 2002 with the fastest machine in
2012 so over ten years. They're the same architecture. So we can
compare apples and apples. And so ASCI White got 7 teraflops. It's the
Power architecture. It's clocked at 375 megahertz. It has 8,000 sockets
and it could complete one of these loops every clock cycle. It was
finishing like ten instructions every clock cycle. And so we'll just
define that to be one. All right?
In Sequoia, it's over 2,000 times faster than ASCI White but it's the
same basic processor architecture. It's running about four times
faster. They went up to a hundred thousand sockets so there are
reliability issues beyond that. And this is a tightly coupled machine
with the very fast networks. But what happens at the socket level is
the parallelism is now jumped to 64. And in fact of this gain about
half of it is clock rate and just making the machine bigger and better
packaging. The other half is the parallelism on the socket. Okay. And
this has a lot of implications, but it's going to get much worse. So
you look at the Intel MIC.
It will have fifty cores on a socket and each will be four-way
threaded, so it'll go over a few hundred threads per socket and a
teraflop. And in a few years this is going to scale out to a thousand
threads, and hopefully it'll be cache coherent. And we'll have
something equivalent to ASCI White in a single socket. All right? This
is a qualitatively different programming model. You use MPI between the
sockets. It now becomes very high latency, and you have to use massive
general threading within the socket. And we're going to have to rewrite
our codes. We're going to have to rewrite our codes to exploit this
two-level parallelism. So if you think of an image processing algorithm
and you had a bunch of computers, you could just throw an image against
each socket and it would, you know, do the comparison from, you know,
what we just got from tonight to a month ago. But you can't do that
here. You can't throw hundreds of images against a socket because they
just don't have the memory. So what you have to do is you have to
rewrite the code so that it actually parallelizes in the image, okay,
in little regions in the image. And so this is a qualitatively different
model. And now we're asking our students and our postdocs and whatever
to now learn two kinds of programming models, and it's just inevitable
because the physics is driving it. And what you get is what I'm calling
a Socket Archipelago. It's just this big sea of sockets.
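As a minimal sketch of the two-level model just described, the following assumes MPI ranks map to sockets (one image pair per rank) and a thread pool works on tiles within each rank. It assumes mpi4py and numpy are available; the image data and the per-tile difference operation are hypothetical placeholders, not the image-comparison pipeline mentioned above.

```python
# Two-level parallelism: MPI across sockets, threads within a socket.
from concurrent.futures import ThreadPoolExecutor
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank (socket) gets its own image pair -- synthetic data here.
tonight = np.random.rand(4096, 4096)
last_month = np.random.rand(4096, 4096)

def diff_tile(row0, row1):
    # One horizontal strip; the heavy arithmetic happens inside numpy,
    # outside the Python-level loop.
    return np.abs(tonight[row0:row1] - last_month[row0:row1]).max()

strips = [(r, r + 256) for r in range(0, 4096, 256)]
with ThreadPoolExecutor(max_workers=16) as pool:
    local_max = max(pool.map(lambda s: diff_tile(*s), strips))

# Reduce across the "islands" with MPI.
global_max = comm.reduce(local_max, op=MPI.MAX, root=0)
if rank == 0:
    print("largest change across all images:", global_max)
```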
It can get much worse than that. We were looking in another program at
exascale computing and every one of these is another layer of
parallelism. So I've only talked about two here but those are the most
relevant, you know, the ones that work right in this level. So, you
know, we'll get to an exascale but, you know, it'll work for maybe one
code.
And this is a chart from Peter Kogge. And what was really interesting,
some people have talked about you have to move the data and everything
like that. In these supercomputers, you do not want to move the data
because this is a complete trace as a part of LINPACK of one operation,
one floating point operation and what it takes to stage it all up in
terms of power. And it was 475 picojoules. The amount of energy it
takes to do the actual floating point operation is 10 picojoules, so
it's a factor of 50 just to move the data as opposed to compute on the
data. So this is kind of another manifestation of what I'm saying.
All right, so let's switch to data engines. And here progress, it's
very much behind what's been going on with the traditional
supercomputers. They had a very well defined easy benchmark. Storage is
actually quite a bit behind. And here you get this latency of how long
it takes to access data. If it's in the register, it's one cycle. If it
gets down to disk, it's, what, ten million. And remote memory, which is
memory in somebody else's socket, is ten thousand cycles and so there's
a big gap. And so people have been trying to bridge this with using
flash-based solid state disk. And they've had some good success.
So this is Gordon. And you can look at the architecture. But it's
basically a supercomputer but with some solid state drive attached
storage nodes. And it has -- What am I looking for? Okay, anyways, it
has about 8 terabytes of solid state disk. So you can actually load an
8-terabyte data set on to this machine. And it has typical
supercomputing kinds of interconnects. And so this is a particular
benchmark. I actually forget what this is. The nodes here are actually
the size of data structure not the size of the computer. That'd be kind
of big. And so you look at the performance with hard disk drives -- This is very new data; it came out in June -- and when they ran it --
So it computes the same but when they ran it across the hard disk
array, and this is a supercomputing hard disk array, it was this amount
of time. And then when they moved down to the solid state array, there
was a dramatic improvement by a factor of six and a half.
You also note that we're actually starting to get an Amdahl's Law
effect here because we're actually computing a lot more than we're
hitting the disk which is a good thing. Anyways this is available right
now on XSEDE. This is the TeraGrid successor. And so another idea that
Alex Szalay and others came up with [inaudible] was this idea of an
Amdahl-Balanced Blade. So there are three definitions here. And the
Amdahl number is bits of sequential I/O per second per instruction per
second. And simulation codes tend to only need about 10 to the minus 5,
which means they're doing very little I/O. They're just completely
computing. But they built some machines, again, using solid state discs
that were much -- in fact they used the GrayWulf as the standard
because it was a very effective data intensive machine and it had some
good balance. But moving over to SSD, they were able to get even
better Amdahl balance. And their hypothesis is that, for data
intensive apps, you need an Amdahl number of about one.
they're getting that with some of these machines. And they're also
remarkably low power, 30 watts, and they're pretty cheap. And they
built a whole cluster out of it which is what they call Cyberbricks.
And they get this whole 36-node Amdahl balance cluster in a little over
a thousand watts. This is like a hair dryer. This is nothing. And
spectacular I/O performance. And so if you have to crunch an
astronomical data set this is a nice machine, of course, to do it on.
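A back-of-the-envelope version of the Amdahl number just defined (bits of sequential I/O per second divided by instructions per second) is shown below. The figures are illustrative assumptions for a hypothetical SSD-backed node, not measurements of the Cyberbricks hardware.

```python
# Amdahl number for a hypothetical low-power, SSD-backed node.
io_bytes_per_s = 250e6          # assume one SSD streaming ~250 MB/s
io_bits_per_s = io_bytes_per_s * 8
instructions_per_s = 2.0e9      # assume a low-power core at ~2 GIPS

amdahl_number = io_bits_per_s / instructions_per_s
print(f"Amdahl number ~ {amdahl_number:.2f}")  # ~1, the data-intensive target
```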
This is going more mainstream, and Calxeda -- I haven't ever heard it
pronounced so I hope I pronounced it right -- is working with HP on a
program they call Moonshot. And these are very tiny and they're
ARM processors so ARM is in a lot of cell phones and things like that.
And these cards are only like this big. And they can put like 200 and -- Does it say here? No, it's over 200, close to 300 of these chips in a
4U case. And they all boot Linux, so talk about a system administration
nightmare. And each of the little virtual servers -- They're building
another card that has a solid state disk -- would only draw about 5
watts of power. And so you can imagine, you know, Yahoo and Google and
Microsoft cloud servers moving to technologies like this, which saves a
tremendous amount in terms of operating costs. All right. How much time do
I have?
>> : About five minutes.
>> Mark Stalzer: Five minutes. All right. So anyways, you can go even
further than this. Yes?
>> : You said that you may expect a big data cloud to use this
technology...
>> Mark Stalzer: It could.
>> : But is that -- Do you see any show-stoppers for that? Or is that
actually going to happen?
>> Mark Stalzer: You would have to ask the people that run it, but it
seems that given that they're running a bunch of commodity servers, you
know, these servers are going to keep shrinking too. And their work
loads are designed to work well on those kinds of things. And so I
would imagine it would...
>> : But cost-wise are they competitive?
>> Mark Stalzer: The power is much lower. So I'm running out of time
here and I want to get to the two punch lines on the whole thing. So we
could do a lot better, though, for data intensive applications. In fact
we can do a hundred times better than what we're doing now with
existing technology. And that is by using parts that the companies like
Apple are putting into things like iPads. And what you can do is you
can array these parts up and you can, on a single blade -- All right? -- you can get like 64 of them. They're small. They're cell phone parts.
But when you aggregate it all up, you get about six [inaudible] flash
on a blade; you get about six terabytes. But in terms of the
performance, in terms of bandwidth and latency, it's about a hundred
times faster. And then you can get up to about a four teraflops
accelerator, so again this is very Amdahl-balanced. And it also fits on
a single blade. It would look something like this. It just got
published. And it has huge implications for data to discovery because
it can read its entire data set a hundred times faster just because it
has so many I/O channels and it doesn't have software stacks on the I/O
channels. It's a hundred times faster than random access. It's Amdahl-balanced. And so this is qualitatively new because it's a factor of two
orders of magnitude. You need to have a number of reads much greater
than the number of writes. This is for a very technical reason. But you
can imagine one rack of these things, just about half-petabyte,
handling all of the LSST processing. Okay? And it'd be good for this as
well. In fact I looked at one server, and one server could store a
billion web pages and handle Google's basic search workload. Now they
do a lot more but this is factor of hundredths big. It's performance on
triple stores -- And, again, I'm going to run out of time because I
want to get to something else -- is probably we think about a thousand
times quicker. And maybe this is starting to look at what a storage
metric would be because this is unstructured, you know, semantic data.
And maybe these are the kinds of metrics we should be looking at. You
know, things like triple store performance to get the data field to
where the simulation people are at.
So this is the fun part of the talk. So there's some speculation. The human
mind only does 10 to the 16th Ops. And the whole point of these next
two slides is to show how far we have to go in our trends in computing
for both storage and computation ability. So I'm just stipulating this.
You don't have to believe it, and it's not my number so it's not my
fault if it's wrong. All right? So we could build these now. Okay?
Sequoia's already running faster than this, and you can build FlashBlade-like engines this fast too, probably at about two megawatts. Okay?
That's the important point, two megawatts.
Oh, for all of you people who write simulation codes, an engine like
this would checkpoint in ten seconds, which is just extraordinary. But
now there's another big data engine, all right. This is actually a
monkey brain. So by definition we're at 10 to the 16th. The memory
actually in a storage-intensive system as I've described would be
actually quite a bit larger. And so this would only be about 10% and it
forgets. It forgets all the time. The bandwidth is roughly the same; it
just depends on where you are on the hierarchy. But the packaging is
rather remarkable in that it's really small. And you saw how big
Sequoia was in terms of the number of cubic feet of space, and it's a
factor of 8,000. All right, so these numbers are kind of like
plausible. Here's the amazing number: this thing only draws 25 watts.
All right? So it's 80,000 times more efficient than our current
technologies even if we do every trick we can with our current
technologies. So in terms of trends, we got a long ways to climb. All
right? And then if somebody knows the algorithm at Microsoft Research
here or something, please let us know.
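The "80,000 times more efficient" figure is just the ratio of the power draws quoted above, a roughly two-megawatt engine versus a roughly 25-watt brain at the same nominal 10 to the 16th ops per second; a one-line check:

```python
# Efficiency ratio implied by the numbers quoted in the talk.
machine_watts = 2e6   # ~two megawatts for the engine described above
brain_watts = 25
print(machine_watts / brain_watts)   # 80000.0
```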
So, all right, back to the socket archipelago. And I'm almost done. So,
again, let's look at where the parallelism is at. So at the cluster
level I don't think we can really go beyond 100,000 sockets for
reliability issues and size issues and things like that. And the
latency between the sockets is about 1,000. A thousand what? In a
socket we might go up to about 10,000 threads. This is just in five
years. So what we have is a latency difference of 1,000. So this goes
back to this whole idea of MPI's basically working between all the
islands, okay. And this is where the archipelago idea comes from. And
then the threads are, you know, the tribes on the islands. And so they
can communicate much, much faster here than, you know, rowing a boat
over to the next island. So, again, we're going to have to restructure
our codes. And there has to be large caches on the sockets because the
second you go off socket, that's it. All right? You have tremendous
latency. And I'll claim that the non-volatile storage, whatever
technology it is, is going to have to be stacked on top of the
structures as well otherwise they take forever to get to.
I don't have time for this. How much time do I have? Okay. So anyways,
you can look at the slides later. There's some ideas on programming
these things. But my concluding remarks, last slide, we're not stuck
with clusters. Off-the-shelf technology, you know, it's not what you
just get at Fry's but the things that we know how to do and we can
quickly build.
What is a top 500, like, benchmark for data? Because this will drive
the development of the systems. Okay? It's been fantastically
successful for dot product engines. We want something similar for data
engines. Get used to threaded programming or find somebody who can
write a library to abstract away what you're doing. And also think of
what can be done in terms of a shrinking 10 to the 16th Ops system,
because that's where the technology is ultimately going. And that's it.
>> Yan Xu: Thank you very much.
[ Audience applause ]
>> Yan Xu: And time for a few questions? Yes, Matthew?
>> : I have two questions or two comments. Firstly, on the benchmark
for data, there's a result that I'll show Wednesday from some experiments I've been
running comparing [inaudible] against relational databases on
unstructured data.
>> Mark Stalzer: Okay.
>> : It turns out that the [inaudible] in relational databases...
>> Mark Stalzer: I believe that.
>> : ...in just off-the-shelf. On the programming side, so this would
mean advocating both, you know, MPI parallelism and GPU-type
parallelism [inaudible] as the sort of thing we should be teaching.
>> Mark Stalzer: Right.
>> : Is essentially what you're saying.
>> Mark Stalzer: Actually MPI, yes. But I would say programming with
general threads. GPU's are very SIMD machines. And...
>> : Well, [inaudible] is...
>> Mark Stalzer: Yeah, yeah.
>> : ...a bracket term for that sort of...
>> Mark Stalzer: Right.
>> : ...threaded. Yeah.
>> Mark Stalzer: We have to teach threaded programming.
>> : So this is lightweight memory usage but for small computational
algorithms.
>> Mark Stalzer: Yeah, exactly. And how to synchronize them when
10,000 are flying around at once. Yeah.
>> Yan Xu: Please.
>> : So you were talking about the [inaudible] reads much larger than
writes.
>> Mark Stalzer: Yes.
>> : And then you followed it up by talking about checkpointing.
>> Mark Stalzer: Yeah.
>> : It's exactly the opposite.
>> Mark Stalzer: No, no. So the problem is that flash memories are
programmed by tunneling through their oxide. Right? And eventually you
break it down. So there's some technical things in that it looks like
for a given flash part if you relax non-volatility constraints to like
a week instead of ten years, you can get well over a million writes.
And so the point is that a machine like I was describing would have
to, like, go to sleep, you know, every few days so that you could
swap out flash parts. So if a programmer tries to use it as just a
read-write system, they're going to burn it out and you should charge
them for that. So you want to just think in terms of reading the data a
lot and only writing it. But a checkpoint every hour is no big deal.
>> : Even though you're never reading those checkpoints?
>> : Right. Right, exactly.
>> Mark Stalzer: Correct. Right.
>> : But we can also repair outside [inaudible].
>> Mark Stalzer: Yeah.
>> : [Inaudible]. They just have to [inaudible] every night.
>> Mark Stalzer: Oh, okay. That's okay. I'll just go to sleep.
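The arithmetic behind "a checkpoint every hour is no big deal" follows from the endurance figure quoted above: with the relaxed retention, a flash cell survives well over a million program/erase cycles, so hourly whole-system checkpoints would take on the order of a century to wear it out. The numbers below are the ones quoted in the discussion, not device measurements.

```python
# Rough wear-out estimate for hourly checkpoints on relaxed-retention flash.
writes_per_cell = 1e6        # "well over a million writes"
checkpoints_per_day = 24
years = writes_per_cell / (checkpoints_per_day * 365)
print(f"~{years:.0f} years of hourly checkpoints")   # ~114 years
```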
>> Yan Xu: Yes? Final question.
>> : The 10 to the 16th Ops 25 [inaudible]...
>> Mark Stalzer: Yeah.
>> : ...machine is a nice piece of technology but you cannot trust it.
[ Audience laughing and commenting simultaneously ]
>> : Well, no, you just use the [inaudible].
>> Mark Stalzer: You can't do it with CMOS, okay. But graphene-based
structures will get down there. And they're actually -- I'm not a
device physicist but they're actually even more reliable than CMOS
parts. But you still can't trust it, I agree.
>> Yan Xu: Okay. Thank you very much, again.
[ Audience applause ]
>> Yan Xu: And we'll move to the panel discussion, so please the
speakers for the panel discussion.
>> Alexander Szalay: So we need to -- In order to get to sort of
exascale and exabytes of data we need to get on a completely different
curve. So I think Mark gave a really wonderful start to this. But we'll
see that most likely we will build systems with millions of components
so with very high-density and low power. And then there will be all
sorts of interesting issues coming from the change in the programming
paradigm. Not just about how do we handle the threading but also how do
we handle the frequent failure of the components and how do we write
codes which are also self-healing or recover from [inaudible]. How do
we create toolkits where we can basically inject hardware failure or
anomaly at will into the different pieces of the code so that we can
actually see how to debug the code essentially recovering from all the
errors.
And on the power side if we are, for example, focusing entirely on the
data intensive part -- So, for example, if you want to build a very
heavy [inaudible] engine that really streams data at Amdahl number of
one, we might consider, for example, cache to be an impediment. So
today a lot of the power -- So when Intel builds processors they make
all sorts of tradeoffs between how much silicone and power do they
devote in the chip to the floating point units to the I/O devices and
also how much area state is spent on cache. And basically cache makes a
lot of sense when we do numerical computing when data locality gives us
a lot of advantages. When you want to stream a petabyte of data from
the disk through the storage hierarchy onto the CPU, essentially the
cache is giving us very little. So in a sense if we could build a
special processor, stream processors basically, this is to some extent
why the GPU's are so good in doing what they do because they have
almost no silicone wasted on the I/O. And basically they also have a
very efficient streaming so we can build basically all these pipelines.
So just to get started.
>> Ian Foster: Okay, I'll say just a few words. A topic that I think is
worthy of consideration is what will the computer systems that the
community needs to build or acquire or pay for look like in the future?
We've heard that they need to be able to accumulate very large amounts
of data, perform analysis on that data, presumably -- We haven't heard
so much about this but also integrate simulation with data analysis for
various reasons. So how many of those systems should we have? Will we
be able to perform all of our computation on systems operated by the
likes of Microsoft or will -- I don't think it necessarily makes sense
for every university to acquire such a system. Will national centers
acquire them and, if so, will they look like our supercomputers today?
I think they probably won't. So those are some questions that perhaps
people have opinions on.
>> Mark Stalzer: Well the advantage of just speaking...
>> Ian Foster: Yes.
>> Mark Stalzer: ...is that you know what I think. I think, again,
power is a crucial issue. Workforce development is very important for
all sorts of things, from threaded programming all the way up to, you
know, how to use data analysis tools. And those are -- And also
finding, like I said, what's the metric to drive data intensive
systems? Because we could do a lot better than what we're doing. And
flash technologies are actually relatively primitive. There's other
emerging technologies that could do a lot better.
>> Yan Xu: Are there any [inaudible] comments?
>> : So all of you made essentially an important point: that for scientific
computing we've been leveraging commercial developments driven by something
else, say GPU's or cell phone technology and so on. I'm thinking that likely
we're going to see a lot more 3D video coming, both real and simulated, and
that that might produce the next generation of GPU equivalents, if you will,
or [inaudible]. Is anybody thinking about what the architecture of those
might be, and how those can be turned into scientific computing engines? Or
can you speculate how you would do it?
>> Alexander Szalay: Already I think the current generation of GPU's can
render more data than we can feed it. So they are already -- So in a sense
there is some race going on, but essentially the bottleneck is already how do
we get the data and the model into the GPU's, basically. Already a tablet can
render a very complex game. So I think the trick will be rather, okay, how do
we get -- for example, in terms of visualization, how do we get the data
again to the engine? Maybe we will be trading more and more CPU, because CPU
will be essentially "free." So using much fancier compression techniques,
basically trading more CPU against bandwidth -- that, I think, might happen.
So we come up with extremely clever compression algorithms.
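As a toy illustration of trading CPU for bandwidth, the sketch below compresses a hypothetical, highly compressible payload and compares the compression cost to the transfer time it would save on an assumed 1 GB/s link; both the payload and the link speed are made-up numbers, not anything from the session.

```python
import time
import zlib

if __name__ == "__main__":
    # Hypothetical, highly compressible payload standing in for data headed
    # to a rendering or analysis engine.
    payload = b"density=1.000 temperature=300.0 " * 1_000_000

    t0 = time.time()
    compressed = zlib.compress(payload, level=6)
    t_compress = time.time() - t0

    link_bytes_per_sec = 1.0e9   # assume a 1 GB/s link, for illustration only
    saved_transfer = (len(payload) - len(compressed)) / link_bytes_per_sec

    print(f"{len(payload)/1e6:.1f} MB -> {len(compressed)/1e6:.2f} MB "
          f"({len(payload)/len(compressed):.0f}x smaller)")
    print(f"compression cost {t_compress*1000:.0f} ms, "
          f"transfer time saved {saved_transfer*1000:.0f} ms")
```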
>> Mark Stalzer: My guess is that they'll look a lot like better versions of
our current GPU's. And the trouble is that these things are difficult to
program, so that's why there's a shift back to a model that is actually in
some sense more energy inefficient, but they can get more performance out of
it. But if you really want to just build a holodeck, the final drivers are
probably going to be, you know, GPU-like structures like what we have now.
>> Yan Xu: Yes.
>> : About power consumption: you said CPU's are essentially free;
[inaudible] talked about [inaudible]. And one of our biggest
problems there is that we got this machine [inaudible] which requires
10 megawatts, and that's not cheap, dominating our running costs. If
CPU's are free but they're [inaudible]. So do you -- And I know the
power costs are slowly coming down, but do you see any bigger leaps in
the future? And, otherwise, CPU's won't be free.
>> Alexander Szalay: So Andrew Chien, who was the head of Intel Research,
down in Chicago -- he has a wonderful talk about this. So the ten by ten. So
basically the current generation of Intel CPU's is like a Swiss army knife a
thousand times over. It has an instruction set that is way, way too complex.
And basically any co-processor -- an FPGA, or a signal processor chip, or a
hardware codec chip, so a DSP chip -- each of them can do a hundred times
better in a very specific task at the same power budget. Okay, so why not
build an array of those, a mosaic of those, on the same silicon, where we
turn all the different components on and off at will instead of trying to
[inaudible] everything in general purpose hardware? I think that's a very
good perspective. Of course it goes kind of [inaudible] Intel's world view,
so no wonder that...
>> Ian Foster: He left.
>> Alexander Szalay: ...left.
>> Ian Foster: And there are a lot of people who have a lot of ideas
for reducing power, running at a much lower power and accepting higher
error rates. I think it's not clear which of those will work out given
the challenging economics of scaling any of these up to mass
production.
>> : So you said that we needed a different sort of way of teaching people
how to program. And at the eScience meeting, the one in North Carolina,
Michael [Inaudible] made an interesting comment that surprised me, but all
the computer scientists in the room kind of nodded their heads. He said that
with all the improvements from Moore's Law and parallelism, we forget about
the improvements in software. And he said we're better off using 1980's
hardware with today's codes than the reverse. So the question is, in order to
take advantage of this, what are the sorts of software problems that need to
be solved to really take advantage of these kinds of hardware improvements?
Like I have some problems that, you know, are not solved by Moore's Law and
parallelism.
>> : Right.
>> : But eventually I think they will be solved by some clever person
figuring out how to do it better. So what should we start teaching ourselves
and our students?
>> Mark Stalzer: Well, you know, an N-log-N algorithm beats an N-squared
algorithm any day. And so it's important for people to study, you know, the
best known algorithms. And there are numerous examples of this. Okay? The
point I was making is, if you want to exploit the capabilities of the chip --
you have massive data sets, or you're trying to model global climate change,
or something like that -- then the students, in addition to knowing
algorithms, need to know more about parallel programming, thread-based
programming. The computer science departments know all of this, and they
know how to teach it, but I mean it's not common for scientists to actually
take these computer science courses. So it's there; it's just another aspect
of education that needs to -- It's one more thing for the students to do. So --.
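A minimal example of the "N-log-N beats N-squared" point, using duplicate detection as a stand-in problem; the problem choice and the sizes are illustrative assumptions only.

```python
import random
import time

def has_duplicates_quadratic(xs):
    """O(N^2): compare every pair."""
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            if xs[i] == xs[j]:
                return True
    return False

def has_duplicates_nlogn(xs):
    """O(N log N): sort once, then check adjacent elements."""
    s = sorted(xs)
    return any(a == b for a, b in zip(s, s[1:]))

if __name__ == "__main__":
    xs = random.sample(range(10**9), 5_000)   # distinct values: worst case for both
    for fn in (has_duplicates_quadratic, has_duplicates_nlogn):
        t0 = time.time()
        fn(xs)
        print(fn.__name__, f"{time.time() - t0:.3f} s")
```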
>> : So it's well known but only to those who know it well? Is that --?
>> Ian Foster: I'd like to comment. Can I make a comment? Right, so the
advances that Michael was talking about were algorithms not software.
Those are different things of course. So I'm not sure that that many
computer science departments teach parallel programming classes.
>> Mark Stalzer: We do.
>> Ian Foster: Some do but not...
>> Mark Stalzer: Yeah.
>> Ian Foster: ...I think it's not quite as common as you might think.
So it's...
>> : [Inaudible]...
>> Ian Foster: Yeah.
>> : Yeah.
>> Ian Foster: So I wanted to observe that, as I understand it, at a place
like -- Google I'm not so familiar with -- Microsoft, there are thousands of
people writing very large scale parallel programs, but they don't know about
multi-threading or MPI. They do it using libraries that are being developed
to meet the particular needs of their applications. And I think you mentioned
the importance of libraries in your talk. But in a sense, if your students
are writing multithreaded code, you've failed in some way -- not the
professor, but the community has failed in some way -- to build the right
infrastructure.
>> Mark Stalzer: I mean an example of that is OpenGL. And, you know, it
renders all these beautiful images, and it can be highly parallel on GPU's.
But the people who use it don't have to know parallel programming.
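In the same spirit, with parallelism hidden behind a library rather than exposed as threads or MPI, here is a minimal sketch using Python's standard concurrent.futures; the per-chunk mean_value function and the synthetic chunks are placeholders for whatever the application actually computes.

```python
from concurrent.futures import ProcessPoolExecutor

def mean_value(chunk):
    """Per-chunk analysis; the author of this function never touches a thread."""
    return sum(chunk) / len(chunk)

if __name__ == "__main__":
    # Hypothetical workload: 100 chunks of numbers standing in for data blocks.
    chunks = [list(range(i, i + 1000)) for i in range(0, 100_000, 1000)]
    # The library decides how to distribute the calls across cores; no explicit
    # threading or MPI appears anywhere in the user-level code.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(mean_value, chunks))
    print(len(results), results[:3])
```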
>> : Is that a question of algorithms or language? If we recode
[inaudible] we would certainly get all this for free.
>> Ian Foster: Well I mean once efficient -- I mean algorithms and
language are different. And certainly writing multithreaded codes using
a [inaudible] library is not a recipe for happiness. And doing it using
[inaudible] is probably far more effective but [inaudible] in itself
doesn't make your algorithms N-log-N instead of N-squared.
>> : No, but if you end up with people having a better natural understanding
of how multithreaded things work, then they will intuitively come up with
their own more naturally threaded algorithms.
>> : Something different. You and also Dennis have been talking essentially
about supercomputer equivalents for data-driven things, whether it's machines
like GrayWulf or the giant data centers. And they are optimized essentially
for web search. And there is the cloud, which is a few of these centers. But
don't you think that what we should be going to is a [inaudible] hierarchy of
clouds, the whole dang climate, with architectures that are optimized for
different things like, you know, certain types of data mining? And I'm
picturing not just the data center itself but down to the blades, or maybe
down to the processor level, instead of one catch-all thing.
>> Mark Stalzer: You're still going to have some basic things that you
can assemble the machines out of. I mean, that's my only --.
>> Alexander Szalay: But I would like to throw in another --. So I was in
[Inaudible] at a data intensive workshop relating to the Grid Forum. And
there was an I/O [inaudible] workshop. So there was a lot of discussion
essentially about file systems. And people are obsessed with POSIX file
systems scaling up to petabytes and so on. And when you think about it -- So
basically in the underlying file system we have a very complex set of
hierarchies where we know exactly where every piece of data is located, at
different granularities. And then we hide everything, and POSIX is basically
a simple data stream. And then after that we build yet another tier of data
structures on top, to again figure out where the data is actually located on
the physical storage. It just really doesn't make sense when we get to very
large amounts of data. And so there is this recent trend of exposing a lot of
these details in the file systems, including the new version of Windows. So
there is a lot of object-level access, basically, to data items. But this was
really -- People also started to kind of realize what Hadoop [inaudible] is
doing in these terms. So basically it's an ultra-simple scheduler, a very,
very dumb I/O scheduler. People have spent 20 to 30 years writing schedulers
which work on supercomputers and schedule basically CPU computations. But it
is not very easy to co-schedule, basically, different types of I/O
operations, random access versus sequential; they mess each other up and
interact. And when you have only sequential scans on the Hadoop-like systems,
and only a single linear [inaudible] of the whole data, it's very simple to
deal with. It's easy to predict the behavior. And basically at the same time
in database systems, people have spent 30 to 40 years, again, optimizing the
complex I/O happening inside the database engine. And so in this world I
think there is a convergence slowly emerging: somewhere there will be some
file system which will merge the best properties of both the traditional file
systems and, basically, the database localities and object granularities.
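As a tiny illustration of the "single linear pass" access pattern contrasted here with random I/O, the sketch below streams a flat binary file of float64 records once and histograms the values; the file format, bin settings, and demo file are assumptions for the example, not anything the panel specifies.

```python
import random
import struct

def sequential_histogram(path, n_bins=16, lo=0.0, hi=1.0, chunk_records=100_000):
    """One linear pass over a flat binary file of float64 records: the simple,
    predictable access pattern that a Hadoop-style I/O scheduler relies on."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    rec = struct.Struct("<d")
    with open(path, "rb") as f:
        while True:
            buf = f.read(rec.size * chunk_records)
            if not buf:
                break
            for (x,) in rec.iter_unpack(buf):
                b = min(max(int((x - lo) / width), 0), n_bins - 1)
                counts[b] += 1
    return counts

if __name__ == "__main__":
    # Write a small demo file of uniform random samples, then scan it once.
    with open("demo.bin", "wb") as f:
        for _ in range(100_000):
            f.write(struct.pack("<d", random.random()))
    print(sequential_histogram("demo.bin"))
```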
>> Ian Foster: But it seems to me that, you know, we -- As you put it, the
web-search-optimized systems assume that data sits in one place and then
computation is performed on it. But, I mean, inevitably it seems that there
are some storage devices that are cheap but slow and some that are fast but
expensive. And so data is going to have to move between these different sorts
of systems. So you may well have heterogeneous systems that are optimized
[inaudible]...
>> Alexander Szalay: And [inaudible] could even be [inaudible] of those
things.
>> Yan Xu: Any more questions or comments?
>> : Again, something different. So moving data is where most of the trouble
is, right? Power, [inaudible]. And so is it worth thinking in terms of
changing our algorithms to essentially be mining data streams in real time
and never seeing the same data again? That would be a very different approach
from having a stationary archive and coming at it from all different
directions.
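One concrete example of the "see the data once and never again" style the questioner describes is an online statistic such as Welford's running mean and variance; a minimal sketch follows, where the Gaussian stream is a stand-in for real instrument data.

```python
import random

class RunningStats:
    """One-pass (streaming) mean and variance via Welford's algorithm:
    every observation is seen exactly once and nothing is stored."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def push(self, x):
        # Update the running mean and the running sum of squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

if __name__ == "__main__":
    stats = RunningStats()
    for _ in range(1_000_000):          # stand-in for an instrument's data stream
        stats.push(random.gauss(5.0, 2.0))
    print(round(stats.mean, 3), round(stats.variance, 3))
```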
>> Alexander Szalay: [Inaudible].
>> : Well, [inaudible].
>> Alexander Szalay: [Inaudible].
>> : Well, but all that is throwing away data at the instrument level. What
I'm thinking of is at the science-grade level.
>> Mark Stalzer: It may get to the point where it's actually cheaper to
re-compute things than to pay the cost of moving the data.
>> : Well, for simulations but...
>> Mark Stalzer: Yeah.
>> : ...for the real-life measurements [inaudible]...
>> Mark Stalzer: Right. Of course.
>> : ...gone, right?
>> Mark Stalzer: Right.
>> : I think good data can outlast a lot of bad computations.
>> Mark Stalzer: It does often, right.
>> : We're not able to afford to analyze the data again and again. You
can only do it once.
>> : We're going to do both. I mean we're already doing both. I mean
you take the data stream. You extract something from it in a pipeline
style and you put it away in case somebody else has a clever idea
later. Yes.
>> : What I'm worried about is that we may not be able to afford that second
step, with the exponential growth as we see it.
>> : Well, we're already -- I mean, George, you certainly know this.
We're already at that stage not so much yet for astronomy but Earth
science and to some extent planetary science...
>> : We're there in astronomy now.
>> : We're there in astronomy.
>> : I was thinking certainly by the time of, say, [inaudible] we could
be at that stage.
>> : That's right.
>> : Well, I mean LOFAR is already generating more data than LSST.
>> : Yeah, but that's...
>> Alexander Szalay: [Inaudible]. There are a lot of [inaudible] simulations
that we run on our supercomputers. We are already there, because we store
many fewer snapshots than would be ideal for our science.
>> : And the climate models too.
>> : So with [inaudible] it will take about 12 hours to read 12 hours
of observed data off the disk. You never, ever want to do that.
>> Mark Stalzer: Right.
>> : And so [inaudible] off the disk. In which case you don't store it
obviously. And so we're planning and [inaudible]. If one day you do get
this better algorithm, you're going to need a hell of a computer. This
disk actually is [inaudible].
>> : Three minutes. Oh I have the perfect question for three minutes.
[ Audience laughter ]
>> : Anybody dare to speculate about quantum computing?
[ Multiple inaudible audience responses ]
>> Ian Foster: Maybe yes, maybe no. [Inaudible].
>> Alexander Szalay: Can I...
>> Yan Xu: Or we can lighten the question and just give some remarks on what
can be done with commercial and inexpensive hardware. Any comments on
disruptive technologies like quantum computing or optical computing?
>> Alexander Szalay: I would say memristors. They are actually much closer to
reality than, I think, quantum computing will be in our lifetimes, or in my
lifetime. Memristors will rewrite computer science if they work, if they
become practical, because every memory element will also be able to do
arithmetic operations. So we can throw out all of the algorithms if that
works. So it will be an interesting world if that comes to --.
>> Yan Xu: Okay. Thank you very much and let's clap for the
panelists....
[ Audience applause ]