Huawei's aptly-named Universal Distributed Storage (UDS) solution

Huawei’s aptly-named Universal Distributed Storage (UDS) solution will soon become a
key element in the University of California High-Performance AstroComputing Center's
efforts to understand dark matter, dark energy, and the other unanswered questions in
astrophysics. Its director recently sat down with Huawei to shed some light on these
matters.
The University of California High-Performance AstroComputing Center (UC-HiPACC) is
where astronomers, astrophysicists, and computer scientists develop and test
mathematical models of how the universe works. Such models might involve entire
galaxies, or clusters of galaxies, and simulate billions of years of interaction. The
numbers involved in modeling of such complexity and scale are, quite naturally,
astronomical, and have lately been straining the center's computational and storage
resources. Its director, Professor Joel Primack, one of the principal originators and
developers of the theory of Cold Dark Matter, recently discussed with Huawei some of
the big unanswered questions in the field of astrophysics while illustrating why a
storage solution of a new order of magnitude is needed to answer them.
Huawei: What do you think will be the biggest areas of discovery in astrophysics in 2012?
Primack: Well, there was a discovery that started the year off; it got me and other
astrophysicists quite excited. My colleagues at Santa Cruz discovered the first cloud of
gas that, as far as we can tell, is primordial. There’s only hydrogen and helium, which is
what came out of the Big Bang, with no heavy elements at all. Furthermore, they were
able to measure from the spectra… they saw this cloud in the line of sight of a distant
quasar, and they measured the amount of deuterium (heavy hydrogen) and it agreed
with earlier measurements in samples that had been somewhat contaminated with
heavier elements, so we weren’t sure whether or not it was really primordial. But now,
we are getting the same answer from this truly primordial cloud. That is a way to
measure the total amount of ordinary matter in the universe, and it confirms the results
that we get from many other methods. So, this was an important test of whether we
really understand how the Big Bang worked and how we got from there to here.
Now, the second thing is that we might be on the verge of discovering the nature of the
dark matter, the mysterious stuff that makes up most of the mass in the universe.
There’s no question that there’s a particular kind of data that we are getting -- we don’t
know yet what it means, but it’s very interesting. We are seeing, using NASA's Fermi
Gamma-ray Space Telescope, that there are gamma rays (very energetic radiation)
coming from the center of the Milky Way galaxy. The energy is 130 billion electron-volts;
that’s about 130 times the energy that you would get if you converted all of the mass of a
proton into pure energy. One hundred and thirty times that. Furthermore, it only comes
from the center of the galaxy. There are also less convincing reports that we are seeing the
same radiation from clusters of galaxies nearby. Now this is just the sort of thing that we
would expect if there is dark matter that is either decaying or annihilating. And this could
be a tremendous clue to the nature of this invisible stuff that is most of the mass in the
universe – the dark matter, which is very different from the ordinary atomic matter. So,
there’s no question that we’re seeing this radiation. It’s more than a five-sigma discovery.
Is it dark matter? That is what we have to find out, and that may not be discovered this
year, but it’s a very interesting possibility.
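For a sense of the figure Primack quotes, here is a quick sanity check in Python: the ratio of the 130 GeV gamma-ray line to the proton's rest-mass energy (about 0.938 GeV) comes out near 139, consistent with his rounded "about 130 times."

```python
# Sanity check of the energy comparison in the interview: the 130 GeV
# gamma-ray line versus the proton's rest-mass energy (~0.938 GeV).
proton_rest_energy_gev = 0.938    # m_p * c^2, a well-known constant
line_energy_gev = 130.0           # energy quoted by Primack

print(f"ratio: {line_energy_gev / proton_rest_energy_gev:.0f}x")  # ~139x, i.e. "about 130 times"
```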
Huawei: What are the challenges that you face with IT infrastructure when doing your
research?
Primack: There are two big challenges – computing, and data storage and transportation.
The computing challenge is that the supercomputers are becoming more and more
inhomogeneous.
High-performance computing used to be, simply, single processors hooked together,
basically running the same single-processor code on
many different processors and just spreading the work around. The new computers have
multi-core chips with many cores addressing the same memory. And then, often they
have accelerators; for example, graphics processing units (GPUs), also addressing the
memory. The way you have to program these things to be efficient is quite complicated.
You have to arrange to have the pipelines of the GPUs always filled to get the
high-speed advantage, and also, since they are basically vector machines, they do
"multiply-add, multiply-add" very fast, but you have to give them only that sort of job to do and
not the things that they do poorly – those you have to use the cores for. So, you have to
cleverly design the architecture of your program to take advantage of the complicated
architecture of the machine, and then often you have to spread the work over a number
of these nodes. So that makes programming much more complicated than it used to be.
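The work-splitting Primack describes can be illustrated with a minimal Python sketch. This is not the center's actual code: the function names and the leapfrog-style update are our illustrative choices, and NumPy stands in for the array programming model (a library such as CuPy would run the same vectorized expression on a GPU).

```python
# A minimal sketch (assumptions, not the center's code) of the split
# Primack describes: give the accelerator only streaming multiply-add
# work, and keep branch-heavy bookkeeping on the CPU cores.
import numpy as np

def leapfrog_kick(vel, acc, dt):
    """Pure multiply-add update: the kind of job a GPU/vector unit does well.
    With CuPy, passing GPU arrays here would run it on the device unchanged."""
    return vel + acc * dt  # one fused multiply-add per element

def select_active(pos, box):
    """Comparison-heavy, irregular work: better left to the CPU cores."""
    mask = np.all((pos >= 0.0) & (pos < box), axis=1)
    return np.nonzero(mask)[0]

# Toy usage: one million particles, one time step.
rng = np.random.default_rng(0)
vel = rng.standard_normal((1_000_000, 3))
acc = rng.standard_normal((1_000_000, 3))
vel = leapfrog_kick(vel, acc, dt=0.01)                       # GPU-friendly
active = select_active(rng.random((1_000_000, 3)), box=1.0)  # CPU-side
```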
We can generate an enormous amount of data, either by analyzing huge amounts of data
or as output from supercomputer calculations, and it becomes very difficult to analyze
the data on the fly. You want to be able to store it and then think about how you want to
analyze it as you start to understand the data. And the difficulty is that we can very
quickly generate more data than is easy for us to store at the big supercomputer centers.
For example, last year we ran a simulation where we stored 7200 time steps and it was
340 terabytes of data. We dumped it onto their fast disk array, but they only let us keep it
there for a few weeks. They said that at that point they would not archive it because it
was too many tapes and they didn’t think that we were ever going to read those tapes,
and they didn’t give us time to fully analyze it. So, we simply lost it. We would like to be
able to store outputs somewhat longer and at least have a chance to give them a little bit
more thoughtful analysis, and so having a larger capability for storage is really quite
important.
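For a sense of scale, here is a quick back-of-the-envelope calculation from the figures Primack quotes; the 10 Gb/s network link used in the second step is our illustrative assumption, not a number from the interview.

```python
# Back-of-the-envelope numbers for the run Primack describes:
# 7200 stored time steps totaling 340 TB.
TOTAL_BYTES = 340e12          # 340 TB, decimal units
N_STEPS = 7200

per_step_gb = TOTAL_BYTES / N_STEPS / 1e9
print(f"~{per_step_gb:.0f} GB per stored time step")            # ~47 GB

link_bps = 10e9               # assumed sustained 10 Gb/s link, for illustration
days = TOTAL_BYTES * 8 / link_bps / 86400
print(f"~{days:.1f} days to move the full dataset at 10 Gb/s")  # ~3.1 days
```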
Huawei: How will cloud computing affect research in your field?
Primack: Well, the cloud is going to give us more flexibility in how we handle large
datasets. Let me give you an example. As head of the High-Performance
AstroComputing Center for the entire University of California (UC) system, I called
together many of the world's leaders in high-resolution galaxy simulations, and all of
the main groups were represented.
We had this meeting a few weeks ago at UC-Santa Cruz and we were able to get the
entire group to agree to do simulations with the same initial conditions. We had just been
able to create the initial conditions in a format that every code could use, and this was a
tremendous achievement. Everybody was impressed that this suddenly became
available. This was brand new technology. Also, we got everybody to agree on a lot of
the other aspects of the code – the way cooling would be implemented, the way the
energetic background radiation would be implemented, and this was partly because we
had the world's leading experts at the meeting. And everybody saw that even though we
would still disagree on the way we do a lot of things within the different codes, if
everybody agreed on these points we would at least be able to narrow down the causes of
differences in the outputs. And then finally, another new tool just became available for all
of the different code outputs – a uniform analysis tool which is going to allow us to
analyze all of the simulations in the same way.
Now, a crucial step is that we have to have a workspace where everybody can share the
inputs and the outputs, together with all of the capabilities of simultaneously analyzing all
of the outputs. We’re going to compare the simulations at many different time sets. And
so we announced that the Huawei’s UDS system is going to be available at Santa Cruz
and we’ve set aside up to 100TB just for this project. And so this is a perfect cloud
project because everybody is going to have access to this part of the data system. We’ll
all get to see each other’s data and see the analyses. And I think this is going to
advance the entire field. At any rate, this is going to be a worldwide collaboration and I
think that we’re going to learn a great deal in the next year.
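To make the "uniform analysis" idea concrete, here is a hypothetical sketch of applying one common analysis routine to every group's outputs in the shared workspace. The path, file pattern, and the halo_mass_function helper are all our assumptions for illustration, not details from the interview.

```python
# Hypothetical sketch of the "uniform analysis" workflow: run one
# agreed-upon analysis identically over every code's snapshots in the
# shared workspace. All paths and names here are illustrative.
from pathlib import Path

SHARED = Path("/uds/galaxy-comparison")   # assumed shared 100 TB workspace

def halo_mass_function(snapshot: Path):
    """Placeholder for one common analysis applied to every output."""
    ...  # e.g., read the snapshot and bin halos by mass

results = {}
if SHARED.exists():
    for code_dir in sorted(SHARED.iterdir()):         # one directory per simulation code
        for snap in sorted(code_dir.glob("snap_*")):  # same stored time steps for each
            results[(code_dir.name, snap.name)] = halo_mass_function(snap)
# Same initial conditions plus the same analysis means remaining
# differences in `results` can be attributed to the codes themselves.
```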
Huawei: In order to handle the high volumes and throughput of the experiment, what
measures has the High-Performance AstroComputing Center taken to improve its
storage system’s performance?
Primack: Well, at UC Santa Cruz we are looking forward to getting a new astrophysics
computer which will have at least 5,000 cores, many GPUs, and a fair amount of
flash storage, and we will combine that with the UDS system from Huawei. This is going to
tremendously increase our computational and data storage capability and that is exactly
the combination that we need to analyze data. And, as I said, we are going to make this
available to people within the UC system and, for some purposes, to a worldwide
community of astrophysicists.
Huawei: Other than UDS, are there other technologies from Huawei that you are
interested in looking at?
Primack: Well, we want to buy an inhomogeneous computer system that will have
several sub-components. There will be one part, for example, that will have a very large
RAM-per-core, because that is very useful for certain kinds of calculations and also for
making videos. And we’ll have another part that will have GPUs and lots of cores
addressing the same RAM. And then we’ll have other units which are basically just a
bunch of individual processors because a lot of people are still running embarrassingly
parallel projects. So, it has to be an inhomogeneous system that will serve the needs of
a variety of users.
Now that the money is in hand, let me describe the way we would like to buy the system. We
actually have the NSF (National Science Foundation) money and a start date of
September 1, 2012, so we have to move quickly now. We want to, first of all, ask a
number of vendors to tell us how they think we should put the system together. We’re
going to specify what the capabilities of the system are but without specifying the details.
And then, after we have been educated by the vendors as to what’s available, then we’ll
do a second round where we are going to have fairly specific specifications and then we
are going to compare vendors head to head and see who is going to give us the best
solution. Our experience from doing this a few years ago is that each of those two
stages takes about two months. So what we are hoping is that by January of next year,
we’re going to have a system in place that will integrate very well with our Huawei UDS
system.
Huawei: So, looking a little farther out, what can we expect to see in astrophysics in the
next few years?
Primack: Well, we don’t know what the universe is made of. We have some idea of how
the invisible stuff controls the universe, the dark matter and the dark energy, but this is a
crazy situation, where we know some of the rules for how we can simulate the universe
but we don’t really know what this stuff is. And this is most of the universe, after all. So,
what we are hoping is that we are going to discover the nature of the dark matter.
We have seen clues and, if we are lucky, we’ll start to make the dark matter at the Large
Hadron Collider (LHC) in Geneva. We will see a particle of the same mass in
annihilation – for example, in gamma rays from the center of the galaxy and elsewhere.
And we will start to discover the dark matter particles in deep underground laboratories.
They are always put deep underground to shield them from cosmic rays, because they
have to be incredibly sensitive. And if we get the same properties… the same mass, for
example, for the dark matter particle, in each of these different experimental arenas,
then we’ll know that we’ve got it. This is going to take years, because each of these
experiments is very difficult. But, this is one of the great quests to understand what the
universe is made of.
The dark energy is even tougher, but the key to discovering the nature of the dark
energy, which is what makes the universe expand faster and faster, is to determine,
extremely precisely, the expansion history of the universe and also how structure grew in
the universe. How galaxies formed, and clusters, and so forth. And this is going to
involve observations with a precision that has never been achieved before. A percent…
a tenth-of-a-percent precision. It used to be that people said that if you got things within
a factor of two, you were doing very well in astronomy. That’s not modern astronomy. It
is now a precision science and we need this sort of precision in order to discover the
nature of the universe.
Incidentally, the dark energy is going to determine our future – whether the universe is
going to expand faster and faster and most of the universe becomes invisible to our
distant descendants or whether, possibly, it slows down and turns into a different kind of
universe. If we understand the nature of the dark energy, we’ll understand the future.
And this is something that we can do in the next decade or so and it’s going to require a
tremendous amount of computing and a tremendous amount of storage.
Huawei: So what’s your opinion about the cooperation between the High-performance
Astrocomputing Center and Huawei?
Primack: Well, we are looking forward to using the UDS system. We are going to put it
into production just as soon as we get our hands on it, because we rather desperately
need it. And we are going to be able to evaluate it in real time. We are going to be using
it very extensively, not just for my own work but also for the work of many other
scientists at UC-Santa Cruz and the University of California in general. Thank you for
giving us this wonderful system.
Dr. Joel R. Primack is a Distinguished Professor of Physics at the University of California
at Santa Cruz and specializes in the formation and evolution of galaxies and the nature
of the dark matter. After helping to create what is now called the "Standard Model" of
particle physics, Primack began working in cosmology in the late 1970s and became a
leader in the new field of particle astrophysics. He has authored more than 200 technical
articles, as well as a variety of works aimed at a broader audience. He is also a Fellow of
the American Physical Society and the American Association for the Advancement of
Science (AAAS). Primack played the principal role in creating the AAAS Congressional
Science and Engineering Fellowship program and the AAAS Program on Science and
Human Rights.