High Performance Super-Computing Systems Research Facility

FOR IMMEDIATE RELEASE
Contact:
Shannan Yeager
New Mexico Consortium
Phone: 505-412-6898
Fax: 505-212-0049
syeager@newmexicoconsortium.org
www.newmexicoconsortium.org
World’s First Large Scale High Performance Super-Computing Systems Research
Facility Opening at New Mexico Consortium
PRObE Center allows computer scientists to test super-computing and big data systems
software at scale.
Los Alamos, NM, October 15, 2012 – The New Mexico Consortium (NMC), Los Alamos
National Laboratory (LANL), Carnegie Mellon University (CMU), and the National Science
Foundation (NSF) have partnered to create the PRObE Center (Parallel Reconfigurable
Observational Environment), a one-of-a-kind computer systems research center located at
Research Park in Los Alamos. A grand opening celebration is scheduled there for Thursday,
October 18, 2012, from 1:00 until 3:00 p.m. U.S. Congressman Ben Ray Lujan, LANL Director
Charlie McMillan and NSF representative Dr. Keith Marzullo are the featured keynote
speakers for the event.
Using $10 million provided by the NSF, along with 2,048 recently retired computers from
LANL, the PRObE Center will be the world’s first facility where computer systems
researchers have access to a dedicated large scale supercomputer where disruptive -- and
even destructive -- testing can be done.
Currently, high performance computing researchers are limited to using small clusters, or
renting virtual machines in large, shared cloud clusters, to test the systems they
develop. When supercomputers are built, they go directly into production for their
intended application, such as climate modeling, leaving no time for systems researchers to
try out new concepts in systems software like operating systems, network software, and
storage software. There have been no supercomputers available for computer scientists to
work out issues with the operating systems or other systems software. As supercomputers
get larger and more complex, providing reliable and efficient system level software
becomes a daunting task; without a place to test new concepts, advancement in high
performance computing will be delayed.
Several years ago, LANL’s deputy division leader for High Performance Computing Gary
Grider was working to decommission some old supercomputer hardware when it occurred
to him there might be a better solution. “I realized our retired machines might still have
some value for other purposes than providing cycles for science,” he says. “I had this idea
that there ought to be a way to reuse these things. One way would be to help systems
researchers who never get a chance at a dedicated large resource.”
After developing this idea with colleagues at Carnegie Mellon University, Grider started a
conversation with the NSF, which agreed that this was a great opportunity to collaborate
and build something that would have enormous, positive implications worldwide. With a
$10 million grant from the NSF, the PRObE team got to work on creating the Center, housed
at the New Mexico Consortium’s offices in Los Alamos.
PRObE is a joint effort of LANL, Carnegie Mellon University and the New Mexico
Consortium. After two years of construction, moving computers and testing operations, the
PRObE Center is now ready to begin hosting researchers with its 2,048-core cluster,
located at the NMC in Los Alamos. A smaller cluster is located at Carnegie Mellon in
Pittsburgh, PA. All of these are recycled machines donated by LANL. The testing
management software is provided by the University of Utah’s Emulab Group.
Katharine Chartrand, the Executive Director of the New Mexico Consortium, who has
worked on the project with Grider and the NSF from the very beginning, is thrilled to see
the culmination of their efforts.
“Re-purposing a supercomputer is hard. Building the first computer science research
environment of this scale is an experiment in itself. I am amazed at the commitment of this
partnership -- the institutions, the individuals, the NSF -- to making this resource available
to the nation. It takes a tremendous, sustained and shared commitment to make a program
of this complexity work,” Chartrand stated.
Dr. Garth Gibson, a professor of computer science and electrical and computer engineering
at Carnegie Mellon University (CMU), has been an integral part of the collaboration. CMU is
a renowned leader in computer systems research; Dr. Gibson immediately saw the value in
the PRObE proposal, and was eager to lend his expertise to the project.
“Unless they leave universities for government or industry jobs, researchers and students
rarely have access to these expensive large-scale clusters,” Gibson explained. “That means
they don’t get the training and education necessary to develop innovations for the
fast-approaching era of exascale computing.”
“Moreover, when a supercomputer is new, it’s immediately needed for applications
research,” Gibson continued. “So even when they do get permission to use larger clusters,
systems scientists can’t run experiments on low-level hardware and purposely break these
machines to see what happens.”
Researchers will be given dedicated use of the PRObE clusters for days, even weeks, at a
time. They will be allowed to replace any and all of the code and even inject faults that
might be destructive to some equipment.
The idea for PRObE originated at LANL, which approached the New Mexico
Consortium to operate the Center.
“LANL isn’t in the business of supporting academic research and didn’t have authority to
foot the bill to house, power, cool and maintain these old systems,” Grider explained. “We
run a supercomputer complex to do nuclear weapons calculations. This is an
offshoot thing that isn’t in our mission.”
PRObE includes a focused educational component targeting undergraduate students
nationally. The Computer System, Cluster, and Networking Summer Institute is a 9-week
technical intensive that emphasizes practical skill development in setting up, configuring,
administering, testing, monitoring, and scheduling computer systems, supercomputer
clusters, and computer networks through a variety of activities including hands-on
technical training, lectures, professional development seminars, and tours of LANL
facilities. This innovative and highly successful program was developed and piloted by
LANL's Information Science and Technology Institute (ISTI). It has been incorporated into
the PRObE Center and is now jointly managed by ISTI and the NMC.
The NSF awarded a $10 million grant for PRObE operations to the New Mexico Consortium
in October of 2010. The supercomputer facility was completed just over a year later, in December
2011, with the help of local high school and college students. The computers are now
configured, and researchers worldwide may start applying to use the 2,048-node facility.
The New Mexico Consortium is a non-profit partnership among the University of New
Mexico, New Mexico Institute of Mining and Technology, and New Mexico State University.
The NMC facilitates research and educational collaborations between universities and Los
Alamos National Laboratory.
For additional information, contact Shannan Yeager at 505-412-6898,
syeager@newmexicoconsortium.org.
###