An Integrative, Cross-Foundation Compute and Data Infrastructure
for Science & Engineering Research
“Multiple Authors in the Community¹”

¹ See the list of contributors in the “Contributors:” section at the end of this document.

Concept: In an era where science and engineering communities have become digital, and
computational and data science is ever more critical to progress, the nation will benefit from a
strong, predictable path for supporting computation- and data-intensive research across a broad
spectrum of research efforts (e.g., individuals, groups, communities, and projects, including
MREFCs), requiring diverse capabilities (e.g., HPC, HTC, data and cloud) and magnitudes
(current HPC facilities are saturated, while the nation lacks national data services in an
exponentially growing big data era). Current needs for computing and data resources are
outstripping current capacities, and with no clear long-term plan in place, research communities
are both underserved and uncertain about the future availability of national resources and
services. Further, investments made in other regions have created a situation in which US
researchers have access to fewer resources than their international competitors—a significant
reversal of conditions that disadvantages US researchers.
To address the challenges facing our nation, research requires more deeply integrated
approaches encompassing theory, experiment and/or observation, data and simulation science,
and crossing disciplinary boundaries. Yet uncertainties in the availability of national computing
and data resources and services lead to fragmentation, not integration, as individual major
research projects prefer to develop their own solutions; they cannot plan on synergies with other
national resources that may not be available in the future. It is also substantially more expensive
as research groups reinvent individual cyberinfrastructure (CI) solutions over and over again.
Further, the needs may be big enough that individual research groups could not possibly reinvent
solutions. A long-term commitment to an integrated national computing and data infrastructure
is needed that recognizes the importance of these resources for scientific and engineering
progress, as well as the importance of the technical culture, expertise and staff that can take
years to develop around major computing and experimental facilities.

It is difficult to adopt a concerted synergistic strategy for conducting computational studies.
There is no unique one-stop shopping place for resources and services for the US science
community. NSF funds XSEDE (providing a fairly modest amount of resources) and Blue Waters
(requiring a completely different route for requesting resources); DOE funds INCITE (providing
considerable resources but constrained by the type of problems that can be considered); the
single NIH-supported resource is Anton at the PSC (providing a small allocation, but on a very
efficient computer, and through yet another mechanism).

We propose a new paradigm that develops a strategy for building, maintaining and operating a
national computational and data-intensive resource infrastructure as well as the critical services
that connect them. We call this the National Compute and Data Infrastructure, or NCDI.
Paradigms around which to develop a path forward are available to research communities
through such activities as the MREFC process and the National Center for Atmospheric Research.
Although modifying these approaches to be suitable for Foundation-wide activities represents a
challenge, the paradigms are well understood and tested.² It is clear that such an approach is
needed by all of NSF’s Computational and Data-enabled Science and Engineering (CDS&E)
communities, and by numerous MREFC projects.

² Note the MREFC vehicle is used for multi-agency initiatives, e.g., between NSF and DOE, where the MREFC and
CD processes have to be intertwined. So this could in principle provide a mechanism for, say, NSF-NIH-DOE
cooperation in providing a national computing and data infrastructure.

Indeed, this approach needs to be strongly driven by and coupled to Grand Challenge
communities addressing complex interdisciplinary problems. To be effective, this approach
should encompass a small number of large-scale computing facilities to support the most
compute- and data-intensive research, coupled with a commensurate number of mid-range
facilities serving the diverse needs of multiple research communities, as well as national-scale
data repositories for multiple communities, high-speed networks to connect to them, and
additional services that integrate these capabilities, including integration with the growing
environments on university campuses. Further, many research universities are struggling to
define and support campus-level resources and are also going regional. Having NSF lead
through a coherent approach would have direct impact by (a) providing a blueprint for regional
efforts to consider and (b) pooling funds to increase leverage.

It is very important to have both large-scale and mid-range facilities around computing, data
and instrumentation. There are leadership-class problems (some of Klaus Schulten’s recent
projects on Blue Waters are good examples), which are pushing the envelope of what can be
done, but there are also numerous mid-range problems that are mainly limited by the availability
of mid-range resources (e.g., in silico drug scoring by free energy methods, or conformational
search for discovering druggable sites in proteins). Industry is increasingly interested in this
latter type of mid-range computation, but is waiting to see whether it is worth the investment.
This research needs to be sustained from the academic side to become fully realized (otherwise
it will probably fall through or be delayed by a decade).

The criticality of a balanced portfolio of resources and services (compute, data and human!)
cannot be overstated. There is considerable diversity among disciplines regarding their
requirements, and possibly even their understanding of their own requirements. Focusing too
strongly on any one segment of this portfolio often constrains the success of significant segments
of the research community. For example, focusing too strongly on leadership-class facilities
often results in the consolidation not only of exposure to the facilities but also of the success of
research groups. In some cases, this can have very positive side effects, particularly as data
repositories are made available, but in many other important respects it simply shuts out a vast
majority of researchers (particularly those at an early stage). In many fields, the vast majority of
research is conducted either at the "medium" scale or on data repositories from leadership-class
simulations (which are few and far between, and for the most part located in Europe).
Though this is understood within many CDS&E communities, it needs to be made clear more
broadly that high-end computing and big data, at least for many disciplines, are mutually
reinforcing. For example, in climate modeling, the standard workflow is to consume substantial
HPC resources in a batch compute mode and store the enormous output, primarily on tape, for
later analysis and synthesis. The bigger the HPC resource, the more data is generated, archived
and retrieved multiple times by multiple researchers for a variety of different types of
post-production interrogation. The larger and more complex the data, the more HPC resources are required to do the
analysis. This feedback loop represents an additional argument for a coherent cyberinfrastructure
addressing computing, data and the critical support services associated with delivering these
capabilities to the nation’s researchers.
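To make this feedback loop concrete, the short sketch below (written in Python purely for illustration; the function names and scaling constants are assumptions invented for this sketch, not figures from any facility) models a batch simulation whose output is archived, primarily to tape, and then retrieved by multiple researchers for analyses that themselves consume HPC resources.

# Schematic sketch of the HPC/big-data feedback loop described above.
# All names and constants are illustrative assumptions, not real facility figures.

def run_batch_simulation(core_hours: float) -> float:
    """Batch HPC run; output volume (TB) grows with the compute consumed."""
    tb_per_core_hour = 0.001  # assumed ratio, for illustration only
    return core_hours * tb_per_core_hour

def archive_to_tape(archive: list, output_tb: float) -> None:
    """Store the simulation output, primarily on tape, for later analysis."""
    archive.append(output_tb)

def analysis_cost(archived_tb: float, researchers: int) -> float:
    """Each post-production interrogation of the archive itself needs core-hours."""
    core_hours_per_tb = 500.0  # assumed ratio, for illustration only
    return archived_tb * researchers * core_hours_per_tb

if __name__ == "__main__":
    tape_archive: list = []
    output_tb = run_batch_simulation(core_hours=1_000_000)
    archive_to_tape(tape_archive, output_tb)
    # Multiple researchers retrieve the same archive for different analyses,
    # so a bigger simulation implies a bigger follow-on compute demand.
    follow_on = analysis_cost(tape_archive[-1], researchers=10)
    print(f"Archived {output_tb:,.0f} TB; follow-on analysis needs ~{follow_on:,.0f} core-hours")

Under these assumed ratios, a tenfold larger simulation produces a tenfold larger archive and, in turn, a tenfold larger analysis demand, which is exactly the feedback loop described above.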
The proposed NCDI would combine some aspects of the current NSF advanced computing
(Track 1, Track 2, and XSEDE) investments, coupled with new data and networking services,
into a major national investment that would integrate these capabilities distributed across a
number of sites. The proposed NCDI embraces this diversity, driven by community needs, as a
core value. The next frontier in science is to produce new CI-based scientific environments with
emergent properties that are capable of making science a true collective endeavor (from data
generation to analysis and publication). In some areas of biology, the speed of data acquisition
outpaces computational analysis. In other areas, computation generates a wealth of electronic
information waiting for experimental validation. This dichotomy requires the development of
different models of CI management and information sharing.
Discussion: The NCDI is envisioned as a critical outcome of earlier CIF21 and other NSF
programs that will unite major thrusts into an integrated framework that would benefit
researchers, ranging from those working within the long tail of science to those at the extreme
high end in fields all across NSF, from individual university campuses to the largest instruments
that are being internationally developed and deployed. The thrusts are CI programs that are part
of CIF21 relating to (a) computational, (b) data, and (c) networking infrastructure, with (d)
digital services and (e) learning and workforce development programs, (f) all driven by grand
challenge communities³ that connect to those activities on campuses, in research communities
and in existing and future MREFC projects. The science-driven community efforts in each of
these areas are maturing and are increasingly dependent on CI.

³ Indeed, under the auspices of the ACCI, six task forces produced reports full of recommendations on each topic!

The NCDI would provide unique opportunities for researchers to define and use a highly
responsive full-scale CI that reaches from instruments to computational, data stewardship and
analysis capabilities to publications for long-term highly innovative research programs. The
development of community standards and approaches during the MREFC design process—an
ongoing activity in the envisioned NCDI—will accelerate research programs that will come to
fruition during the construction and operation of the full-scale infrastructure.
With an NCDI in place, individual MREFC projects, each of which currently develops its own
(often duplicative and typically less capable) CI, would then be able to leverage this long-term,
stable environment to deploy the needed computing, data stewardship and analysis facilities, and
more deeply integrate the services of NCDI with other MREFCs, enabling funding to be focused
on scientific and engineering issues and better supporting complex problem solving for
disciplinary, multidisciplinary and interdisciplinary research. Where appropriate, other MREFC
projects would be able to plan longer-term investments that explicitly build on the NCDI.
Previously, this type of in-depth CI knowledge was available only on an ad hoc basis, with
limited cross-domain expertise. NCDI would offer access to comprehensive intellectual resources
previously unavailable to the MREFC cadre of participants. This would more naturally enable
large-scale data services to be co-located with large-scale computing facilities, which are needed
in an era of big data, while better serving communities by leveraging commonly needed
infrastructure, staff expertise, and services. The NCDI will enable a new level of scientific
inquiry and effectively support complex problems, such as multi-messenger astronomy, that
require the participation of multiple MREFCs and compute and data facilities. NCDI would
become a national resource, working cooperatively with the university community, for nurturing,
developing, and training the workforce of the 21st century. Like the computational facilities,
workforce development is currently fragmented; NCDI could change that in a significant way.
An examination of the processes employed in building and operating research infrastructure and
services, such as the MREFC process, is instructive with respect to what will work for building
and sustaining NCDI and what needs to be modified from these examples. For example, because the
component technologies evolve with characteristic times much shorter than the standard MREFC
cycle, a set of inter-related design-build/integrate-deploy processes must run concurrently
with the operation of facilities and services to provide a robust, reliable and secure NCDI that
meets the production needs of the nation’s CDS&E community and upon which other
MREFCs can also build. Although long-term infrastructure is a continuing effort, lasting longer than
normal NSF Research and Related Activities (R&RA) awards, it is also important to encourage
innovation and some level of competition via the proposal process. In order to ensure
effectiveness of resources, NSF could consider options that would allow for a series of
continuing awards, within the MREFC process, based on merit, successful operation, and ability
to meet the various communities’ needs. For example, previous NSF programs have used rolling
5-year awards, with “continue,” “phase out” or “re-compete management” decisions through
periodic reviews. Another example is the fifty years of sustained computational support offered
to the atmospheric and related communities at the National Center for Atmospheric Research,
which was strengthened and made more responsive to community needs by periodic reviews on
five-year cycles and re-competition of management. Processes for assessing the
NCDI and the evolution of community computing needs should be a part of these review
processes. Additional activities, such as campus bridging, algorithm development, software
institutes, related disciplinary research, and so on, could be supported separately, as they are at
present, but could also be made more effective and collaborative by being integrated into NCDI.
These activities should be tightly linked to NCDI. The NCDI will acknowledge, accommodate,
and embrace the issues facing university-based researchers that might be addressed by campus
bridging and software activities. For example, full discoverability and
interoperability of all relevant data for a given investigation, whether they reside in the cloud, in
a formal data repository or on a researcher’s laptop, has been a holy grail for many disciplines,
but has been achieved only at a very limited and rudimentary level, and by only a few. Further,
interdisciplinary interoperability remains a dream, despite progress in some areas (e.g.,
EarthCube). The NCDI strategy will provide hooks and incentives for the development of data
and algorithmic interoperability. It is understood that the granularity of these activities varies
greatly, so NCDI cannot solve these problems centrally.
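As a purely hypothetical illustration of the kind of "hooks" for data discoverability that such a strategy might incentivize, the Python sketch below shows a single federated query spanning a cloud store, a formal repository, and a researcher's laptop; every class, function, and record name here is invented for this sketch and does not correspond to any existing NCDI, EarthCube, or repository interface.

# Hypothetical sketch of federated data discovery across heterogeneous sources.
# Nothing here reflects an actual NCDI or community interface.
from dataclasses import dataclass
from typing import Iterable, List

@dataclass
class DatasetRecord:
    identifier: str        # e.g., a DOI or a local file path
    source: str            # "cloud", "repository", or "laptop"
    keywords: List[str]    # minimal discovery metadata

class Catalog:
    """Minimal metadata catalog; each data location would expose one."""
    def __init__(self, source: str, records: Iterable[DatasetRecord]):
        self.source = source
        self.records = list(records)

    def search(self, keyword: str) -> List[DatasetRecord]:
        return [r for r in self.records if keyword in r.keywords]

def federated_search(catalogs: Iterable[Catalog], keyword: str) -> List[DatasetRecord]:
    """Discover all relevant data for an investigation, wherever it lives."""
    hits: List[DatasetRecord] = []
    for catalog in catalogs:
        hits.extend(catalog.search(keyword))
    return hits

if __name__ == "__main__":
    catalogs = [
        Catalog("cloud", [DatasetRecord("doi:10.0000/example-a", "cloud", ["climate", "global"])]),
        Catalog("repository", [DatasetRecord("doi:10.0000/example-b", "repository", ["climate"])]),
        Catalog("laptop", [DatasetRecord("/home/researcher/run42.nc", "laptop", ["climate", "regional"])]),
    ]
    for record in federated_search(catalogs, "climate"):
        print(record.source, record.identifier)

The point of the sketch is the design choice it encodes: the sources stay heterogeneous, and interoperability comes from a thin, shared discovery layer rather than from centralizing the data, consistent with the observation above that NCDI cannot solve these problems centrally.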
It is worth noting here that the spirit and intent of an MREFC was initially to support
infrastructure and services, not the research that uses them. For the MREFC approach to
work here, there will have to be significant agreements reached on what constitutes a shared
infrastructure for research — a common denominator for a broad range of applications. The
simple collection of domain-specific requirements will not constitute the definition of an
MREFC; it must integrate distributed facilities to serve broad needs. As noted above, there is
considerable redundancy in NSF’s CI investments, which constitutes not only a loss of available
financial resources, but also lost opportunity costs and, in the worst case, balkanization. Striking the
balance that most cost-effectively serves diverse needs will be critical.
In this model, NCDI construction funds could come from the MREFC allocation or another
major budget initiative, with a profile that is estimated to be in the range of $250-300M per year
(of order $3B over a decade). Assuming the NCDI would be managed by ACI, the operating
funds would come from CISE, with contributions from other directorates based on their specific
needs. This would further allow ACI (and other NSF directorate programs) to better support its
portfolio of services, software, development, CDS&E programs, and so on.
Advantages:
• The NCDI will provide the research community with a clear roadmap for planning a
broad range of CDS&E research activities, from individual investigations to major
facility projects;
• The NCDI will provide a means to coalesce community engagement around computing
and data infrastructure;
• The NCDI will provide an integrated approach to data-intensive (Big Data) science. Big
Data services, e.g., astronomy, environmental science, data pipelines, etc., will be more
naturally and effectively supported and can be co-located with simulation services at
major computing sites. This will drive a convergence of big data and big compute
activities at common sites, which will be required as data volumes grow;
• The longer-term stability of the NCDI will enable the retention and building of world-class
technical staff, provide an intellectual commons for communities to gain and share their
experience, and serve as a linchpin for the university community to build the workforce of
the 21st century that is so critical to success in the rapidly evolving computational and
data sciences. This is especially important for emerging data services, both at national
and campus levels, that will need a framework that lives long past any individual grant;
• The NCDI will provide a national structure that other MREFC projects can depend on
being in place, enabling them to plan for and take advantage of synergies provided by
leveraging such an infrastructure, leading to significant cost efficiencies;
• Exploiting such synergies, NCDI could reduce the cost of providing the major compute
and data resources and services needed for scientific progress, resulting in savings to both
CISE/ACI and NSF;
• The NCDI will enable NSF’s MREFC projects to focus more on science and engineering
issues, since they will not have to completely reinvent their own CI and can leverage the
investments in the NCDI for their own purposes;
• The NCDI will enable NSF projects to more fully exploit synergies that will be needed to
better support interdisciplinary research trends (e.g., multi-messenger astronomy, which
will require unified or coordinated computing and data services from multiple MREFCs
and communities, such as LIGO, IceCube, LSST, DES, computational astrophysics, etc.);
• The NCDI will support interagency cooperation around MREFC projects (e.g., with DOE
and/or NIH) and can be leveraged for deeper cooperation.
Disadvantages:
• A new process needs to be developed, or current processes need to be modified, in a way
that draws on the success of previous efforts. For example, the MREFC process has been
designed around a model of building a single instrument to be used by a single community
for a decade or more, and has a very heavy, very long process of community engagement,
conceptual design, preliminary design, final design, and NSB approval. This process would
require modification to accommodate the fast-paced evolution of computing technologies;
such a process is already being developed by the XSEDE project.
• Significant urgent work will be needed among all of the stakeholders in the NCDI
(research communities, NSF, OSTP and others). What often takes more than a decade
would ideally be compressed into just a few years, and would have to be done at many
levels. This would require the dedication of a number of people in the community over a
sustained period of time to define and implement the inter-related, concurrent processes
necessary at all of these levels.
On a related note, if, in fact, the MREFC were to be managed by ACI, there is significant concern
in the community about ACI’s current organizational location within CISE. The transition of OCI
in the Office of the Director to ACI in CISE has induced a perception that the current situation
with NSF computing has a lot to do with an institutional resistance (on the part of NSF) to
funding infrastructure. Moving away from a cross-cutting OCI toward relocation as ACI within
CISE is seen as a signal that such cross-cutting, coherent cyberinfrastructure planning and
investment is of diminishing importance to the Foundation, to the detriment of NSF-supported
communities as a whole.
A conceptual plan, whether it is a new approach or a modification of existing paradigms, needs
to be developed as soon as possible. This would need to involve numerous stakeholders and
communities, including existing MREFC projects, to explore how such an environment would
operate. It is likely that such an approach would lead to a significant restructuring of existing CI
centers, university stakeholders, etc. Many assumptions above would need to be quantitatively
validated, and additional issues would need to be considered to determine the viability of this
approach. In particular, the need for and benefits of a National CI Infrastructure would need to be
considered in depth, weighing not only its scope and cost, but also its impact on US science
and national competitiveness, training and workforce development, the role of NSF with respect
to other agencies, and more. This concept paper merely aims to raise awareness of the
possibility of such an approach and to highlight issues that would need to be considered in
exploring its viability.
Contributors from the Community
Steering Committee
The effort of developing this concept is guided by a steering committee consisting of:
Dan Atkins
Professor of Information, School of Information
Professor of Electrical Engineering and Computer Science
College of Engineering, University of Michigan
Alan Blatecky
Visiting Fellow, RTI International
Thom Dunning
Co-director, Northwest Institute for Advanced Computing
Affiliate Professor of Chemistry, University of Washington
Irene Fonseca
Mellon College of Science University Professor of Mathematics
Director, Center for Nonlinear Analysis
Department of Mathematical Sciences, Carnegie Mellon University
Sue Fratkin
Washington liaison for the Coalition for Academic Scientific Computation
Sharon Glotzer
Stuart W. Churchill Collegiate Professor of Chemical Engineering
Professor of Material Science & Engineering
Macromolecular Science and Engineering, and Physics, University of Michigan
Cliff Jacobs
Clifford A. Jacobs Consulting, LLC
Gwen Jacobs
Director of Cyberinfrastructure, University of Hawaii
Thomas H. Jordan
Director, Southern California Earthquake Center
University Professor & W. M. Keck Professor of Earth Sciences
Department of Earth Sciences, University of Southern California
Steve Kahn
Director of the Large Synoptic Survey Telescope
Cassius Lamb Kirk Professor of Physics, Stanford University
Bill Kramer
Director, @scale Program Office, National Center for Supercomputing Applications
PI, Blue Waters
Anthony Mezzacappa
Newton W. and Wilma C. Thomas Endowed Chair
Department of Physics and Astronomy, University of Tennessee, Knoxville
and
Director, Joint Institute for Computational Sciences
Oak Ridge National Laboratory
Mike Norman
Director, San Diego Supercomputer Center
Professor of Physics, UC San Diego
David Reitze
Executive Director of the LIGO Laboratory, Caltech
Professor of Physics, University of Florida
Ralph Roskies
co-Director, Pittsburgh Supercomputing Center
Professor of Physics, University of Pittsburgh
Ed Seidel
Director, NCSA
Founder Professor of Physics, University of Illinois at Urbana-Champaign
Dan Stanzione
Executive Director, Texas Advanced Computing Center
Rick Stevens
Associate Lab Director
Computing, Environment and Life Sciences, Argonne National Laboratory
Professor of Computer Science, University of Chicago
John Towns
Executive Director, National Center for Supercomputing Applications
PI and project director, XSEDE
Paul Woodward
Professor, Minnesota Institute for Astrophysics
Director, Laboratory for Computational Science and Engineering
Contributors:
Other contributors include:
Aleksei Aksimentiev
Associate Professor of Physics
University of Illinois at Urbana-Champaign
Scott L. Althaus
Charles J. and Ethel S. Merriam Professor of Political Science
Professor of Communication
Director, Cline Center for Democracy
University of Illinois at Urbana-Champaign
Stuart Anderson
Research Manager – LIGO
Dinshaw Balsara
Associate Professor, Astrophysics
Concurrent Associate Professor, Department of Applied and Computational Mathematics and
Statistics
Department of Physics, University of Notre Dame
J. Bernholc
Drexel Professor of Physics
Director, Center for High Performance Simulation
North Carolina State University
Daniel J. Bodony
Blue Waters Associate Professor of Aerospace Engineering
University of Illinois at Urbana-Champaign
Dr. Gustavo Caetano-Anollés
Professor of Bioinformatics, University Scholar
Department of Crop Sciences, University of Illinois at Urbana-Champaign
Santanu Chaudhuri, PhD
Director, Accelerated Materials Research
Illinois Applied Research Institute
College of Engineering, University of Illinois at Urbana-Champaign
Dr. Larry Di Girolamo
Blue Waters Professor
Department of Atmospheric Sciences
University of Illinois at Urbana-Champaign
Dr. David A. Dixon
Robert Ramsay Chair
Department of Chemistry, The University of Alabama
Said Elghobashi
Professor
Mechanical and Aerospace Engineering, University of California, Irvine
Charles F. Gammie
Professor, Center for Theoretical Astrophysics
Astronomy and Physics Departments, University of Illinois at Urbana-Champaign
Claudio Grosman
Professor
School of Molecular and Cellular Biology, University of Illinois at Urbana-Champaign
Sharon Hammes-Schiffer
Swanlund Professor of Chemistry
Department of Chemistry, University of Illinois at Urbana-Champaign
Victor Hazlewood
Chief Operating Officer
Joint Institute for Computational Sciences, University of Tennessee
Curtis W. Hillegas, Ph.D.
Associate CIO for Research Computing, Princeton University
Eric Jakobsson
Director, National Center for Biomimetic Nanoconductors
Professor Emeritus of Biochemistry
School of Molecular and Cellular Biology, University of Illinois
Peter Kasson
Associate Professor
Departments of Molecular Physiology and Biomedical Engineering, University of Virginia
Jim Kinter
Director, Center for Ocean-Land-Atmosphere Studies
Professor, Climate Dynamics
Dept. of Atmospheric, Oceanic & Earth Sciences, George Mason University
Martin Ostoja-Starzewski
Professor
Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign
Christian D. Ott
Professor of Theoretical Astrophysics
California Institute of Technology
Vijay Pande
Director, Folding@home Distributed Computing Project
Director, Program in Biophysics
Camille and Henry Dreyfus Professor of Chemistry and, by courtesy, of Structural Biology and
of Computer Science, Stanford University
Nikolai V. Pogorelov
Professor
Department of Space Science, University of Alabama
Tom Quinn
Professor
Department of Astronomy, University of Washington
Patrick M. Reed
Professor, School of Civil and Environmental Engineering
Faculty Fellow, Atkinson Center for a Sustainable Future
Cornell University
Benoit Roux
Department of Biochemistry and Molecular Biology
Department of Chemistry
University of Chicago
Andre Schleife
Blue Waters Assistant Professor
Department of Materials Science and Engineering, University of Illinois, Urbana-Champaign
Barry Schneider
Staff Member
Applied and Computational Mathematics
National Institute of Standards and Technology (NIST)
John Shalf
Chief Technology Officer, NERSC
Ashish Sharma
Environmental Change Initiative
University of Notre Dame
Bob Stein
Professor of Physics and Astronomy
Physics and Astronomy Department, Michigan State University
Brian G. Thomas
C.J. Gauthier Professor of Mechanical Engineering
Department of Mechanical Science & Engineering, University of Illinois at Urbana-Champaign
Jorge Vinals
Director, Minnesota Supercomputing Institute
University of Minnesota
Gregory A. Voth, Ph.D.
Haig P. Papazian Distinguished Service Professor
Department of Chemistry, The University of Chicago
Shaowen Wang
Director, CyberInfrastructure and Geospatial Information (CIGI) Laboratory
Professor and Centennial Scholar of Geography and Geographic Information Science, School of
Earth, Society, and Environment, University of Illinois
Don Wuebbles
Harry E. Preble Endowed Professor
Department of Atmospheric Sciences, University of Illinois at Urbana-Champaign
P. K. Yeung
Professor
School of Aerospace Engineering, Georgia Institute of Technology