
NOAA R&D HPC Allocation
Introduction
The NOAA Recovery Act Climate Computing/Modeling Project managed by the
NOAA Office of the Chief Information Officer accelerates the implementation of
NOAA’s High Performance Computing (HPC) Strategic Plan developed in Oct 2008.
Funding from the American Recovery and Reinvestment Act (ARRA) of 2009 is
being used to acquire two large-scale HPC systems for research and development
(R&D), the data center space to house these systems, and the associated advanced
networking. The new HPC systems to be located in Tennessee and West Virginia
will replace the existing R&D HPC systems currently located at the Geophysical Fluid
Dynamics Laboratory (GFDL) in Princeton, NJ, the Earth System Research
Laboratory (ESRL) in Boulder, CO, and the National Centers for Environmental
Prediction (NCEP) systems in Gaithersburg, MD.
This document describes the process for allocating time on NOAA’s R&D HPC
resources, a process that aligns with NOAA’s new process for Strategic Execution
and Evaluation (SEE). It is expected that the current NOAA Administrative Order
for the Management and Governance of HPC (NAO 216-110) will be updated
according to the information contained herein.
Background
The R&D-purposed HPC acquired with ARRA funding represents the first time that
all of NOAA’s large-scale supercomputing is remotely located from its traditional
HPC users (OAR/GFDL, OAR/ESRL, and NWS/NCEP). Up to this point, allocations on
NOAA R&D HPC systems have largely aligned with users and R&D projects that were
co-located with those systems. Now, a robust allocation process has been
implemented that not only reflects the established activities that require the use of
NOAA R&D HPC but also maximally utilizes the petascale computing the ARRA
funding has provided in Tennessee and West Virginia.
It must be recognized that the ARRA funds were provided to fill “critical gaps in
climate modeling … for continuing research into the cause, effects, and ways to
mitigate climate change.” (from the ARRA language). Already, about 75% of NOAA’s
current R&D HPC allocation addresses climate issues. However, it must also be
recognized that the two ARRA-funded systems must serve all of NOAA R&D since all
NOAA R&D HPC funding is dedicated to supporting them through the end of their
system life. The challenge before the Allocation Committee is to weigh established
use against the new opportunities in environmental modeling that the ARRA
computing represents and the stated purposes for which it was intended. Those
purposes have usually included statements from NOAA management that the
Tennessee system is for Climate Research and Modeling and the West Virginia
system is for Climate Model Development for Operational Predictions.
The Process
The NOAA HPC Board Allocation Committee is tasked with the following:
 Review Information
o The latest allocation utilization reports
o New project allocation requests
o Update on system configuration and status
 Analyze Information
o Are there sufficient resources available for allocations to be adjusted?
o Is there an impact of system changes on the allocations?
o Are there system changes that would better achieve the allocation
plan?
 Develop/Adjust Allocations based on
o Established evaluation criteria
o HPC system performance and capability (e.g., chronic deficiencies in
availability, new resources, etc.)
The following represents a proposed set of core principles for guiding allocation
decisions on the NOAA R&D HPC systems:
 The process for setting allocations will be open and transparent
 Allocations must be targeted to address NOAA’s Mission Objectives and
support the associated Evidences of Progress (EOPs) within those
Objectives1
 Long-term, proven strategies are essential to solving NOAA’s difficult science
challenges
 Allocations are granted only to viable research ideas that show a clear
scientific methodology for meeting the project’s objectives and goals and
demonstrate a clear need for HPC
o NAO 216-110 defines HPC as the unified system for solving NOAA's
largest computational problems, composed of supercomputer systems
and associated communications, analysis, visualization and storage
systems, and application and systems software with all components
well integrated and linked over a high speed network.
 All HPC requirements that have funding (for acquisition, operations, and
maintenance) will be guaranteed an equitable share of NOAA’s HPC
resources.
 Allocations will be categorized as:
o Primary: A base amount of resources is to be assigned to projects
aligned with NOAA’s primary modeling centers (currently GFDL,
NCEP, and ESRL) without the uncertainties and risks of annual or
short-term fluctuations, much like observational/monitoring
responsibilities are dedicated and sustained by NOAA
o Secondary: In support of other projects led by NOAA organizations
that require access to HPC resources.2
o Tertiary: In support of dedicated projects by non-federal cooperative
institute or cooperative agreement centers
o Quaternary: Competitively selected projects from universities or
federal centers, targeted towards the Mission objectives
 Resources will be available for high-risk, high-payoff research through a
competitive high-performance computing proposal process
 Resources will be available for System Operations (software engineering,
maintenance and testing) and the development and testing of software tools
to support advanced computing R&D activities
 Aside from the important exception of developing portable and efficient
models, projects should be assigned to a single system to maximize the
efficiency of model development and execution
 The scientific merits of projects by the external community (including NOAA
Cooperative Institutes, Climate Program Office grantees, and Universities)
must be assessed in the same manner as for the NOAA internal units
 Secondary, Tertiary, and Quaternary allocations are awarded on a yearly
basis; all allocations are enforced monthly
1 If the Integrated Modeling Enterprise Objective collects all HPC requirements
under the SEE process, all HPC-relevant EOPs should be reflected in the Modeling IP.
When deciding allocations, the Allocation Committee might consider the answers to
the following questions:
 Does the allocation request address NOAA’s mission by directly linking to an
NGSP Goal Objective and EOP? (Relevance)
 Has a Line Office identified the project as a priority for execution through the
SEE process? (Priority)3
 Does the proposed project utilize proven strategies for solving difficult NOAA
science challenges? (Experience)
 Does the research idea embodied in the modeling use demonstrated
scientific and computational methodologies, is it viable, and does it
demonstrate a clear need for HPC? (Readiness)
 Is the requested allocation consistent with the scope of the proposed
research? (Alignment)
 Does the proposed project have funding for Acquisition or Operations &
Maintenance? (Funded)
2 It is assumed that some of the projects receiving a Secondary allocation will
eventually receive a Primary allocation as their useful application to NOAA’s mission
is demonstrated.
3 LO prioritization will help the Allocation Committee choose the highest priority
projects to run on what is almost certainly going to be a system too small to meet all
requirements.
HPC allocation requests are evaluated against a set of criteria, using adjectival ratings. Examples of how each criterion might
merit each rating are provided in the following table:

| Criteria | Outstanding | Acceptable | Unacceptable |
|---|---|---|---|
| The allocation request addresses NOAA’s mission by directly linking to an NGSP Goal Objective and EOP (Relevance) | The request meets multiple NOAA Goals and Objectives | The request meets a NOAA Goal and Objective | The request does not meet a NOAA Goal and Objective |
| A Line Office has identified the project as a priority for execution (Priority) | Documentation from SEE activities has identified this project as the highest priority | Documentation from SEE activities has identified this project as a priority | Documentation from SEE activities has not identified this project as a priority |
| The proposed modeling utilizes proven strategies for solving difficult NOAA science challenges (Experience) | Project has already made significant contributions to the NOAA mission over a sustained period of time | Project utilizes well-known strategies with a strong potential to contribute to the NOAA mission | No evidence the proposed project will achieve the desired outcome |
| The research has demonstrated scientific and computational methodologies and demonstrates a clear need for HPC (Readiness) | The project has efficiently used significant HPC resources over a sustained period of time with successful results | Project has a strong potential to efficiently utilize HPC | The project does not require HPC or has a poor plan for utilizing HPC |
| The requested allocation is consistent with the scope of the proposed research (Alignment) | The project is a continuation/extension of a current activity that has a record of fully utilizing its allocation | The allocation request appears consistent with the science goals but has no record of accomplishment | The proposed research requires less than the requested allocation |
| The proposed project has funding for acquisition or Operations & Maintenance (Funded) | The project will fully fund the acquisition and ongoing operations and maintenance of additional resources commensurate with the request | The project will fund acquisition of some additional resources and/or contributes to the operations and maintenance of the R&D HPC | The project contributes no funding to the R&D HPC |
The following table may be useful when evaluating individual allocation requests
using the above key:
| Criterion | Rating |
|---|---|
| The allocation request addresses NOAA’s mission by directly linking to an NGSP Goal Objective and EOP (Relevance) | |
| A Line Office has identified the project as a priority for execution (Priority) | |
| The proposed modeling utilizes proven strategies to solving difficult NOAA science challenges (Experience) | |
| The research has demonstrated scientific and computational methodologies and demonstrates a clear need for HPC (Readiness) | |
| The requested allocation is consistent with the scope of the proposed research (Alignment) | |
| The proposed project has funding for acquisition or Operations & Maintenance (Funded) | |
Additional factors, such as whether a project can be accomplished with fewer
resources than requested (scalability) and a project’s past performance, can be
considered in a project’s evaluation. Although the intent of the adjectival ratings is
to promote a subjective evaluation of the projects, the Allocation Committee can
also consider objective criteria; for example, any project with an “Unacceptable”
rating in any of the criteria outlined above could be rejected.
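The example objective screen mentioned above can be sketched in a few lines of Python. This is only an illustration of how a rating sheet might be recorded and checked, not a tool the Allocation Committee is known to use. The names `Rating`, `CRITERIA`, `unacceptable_criteria`, and `could_be_rejected` are hypothetical, and the sample ratings are invented for the example.

```python
from enum import Enum


class Rating(Enum):
    OUTSTANDING = "Outstanding"
    ACCEPTABLE = "Acceptable"
    UNACCEPTABLE = "Unacceptable"


# Short names of the six criteria used in the evaluation key.
CRITERIA = ("Relevance", "Priority", "Experience",
            "Readiness", "Alignment", "Funded")


def unacceptable_criteria(ratings):
    """Return the criteria rated Unacceptable, in the order of CRITERIA."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {', '.join(missing)}")
    return [c for c in CRITERIA if ratings[c] is Rating.UNACCEPTABLE]


def could_be_rejected(ratings):
    """Example screen: flag a request that has any Unacceptable rating."""
    return bool(unacceptable_criteria(ratings))


# Hypothetical request, rated by hand for illustration only.
example = {
    "Relevance": Rating.OUTSTANDING,
    "Priority": Rating.ACCEPTABLE,
    "Experience": Rating.ACCEPTABLE,
    "Readiness": Rating.OUTSTANDING,
    "Alignment": Rating.ACCEPTABLE,
    "Funded": Rating.UNACCEPTABLE,
}

print(unacceptable_criteria(example))
print(could_be_rejected(example))
```

Because the text says such a project "could be" rejected, the function only flags the request; the decision itself stays with the Committee.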
A draft allocation request form is at the end of this document. It is anticipated that
the granularity of projects will be such that about 10-15 requests will be evaluated
each year.
Once the evaluation of individual requests is complete, the Allocation Committee
will rank the requests within each of the allocation categories: Primary, Secondary,
Tertiary, and Quaternary. As noted above, it is not expected that Primary allocations
will change as often as the other allocation categories. The package of allocations
recommended to the HPC Board will consist of the set of projects to be run on the
R&D HPC systems and the resources given to them.
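The ranking step can be sketched as follows. The document does not say how requests are ordered within a category, so the numeric weights in `WEIGHTS` and the sum-of-ratings ordering are assumptions made purely for illustration. The function name, the dictionary layout, and the project names are likewise hypothetical.

```python
from collections import defaultdict

CATEGORIES = ("Primary", "Secondary", "Tertiary", "Quaternary")

# Assumed numeric weights; the document defines only the adjectival ratings.
WEIGHTS = {"Outstanding": 2, "Acceptable": 1, "Unacceptable": 0}


def rank_within_categories(requests):
    """Group requests by allocation category and order each group by total score."""
    grouped = defaultdict(list)
    for request in requests:
        if request["category"] not in CATEGORIES:
            raise ValueError(f"unknown category: {request['category']}")
        grouped[request["category"]].append(request)

    def total(request):
        return sum(WEIGHTS[rating] for rating in request["ratings"])

    # sorted() is stable, so tied requests keep their submission order.
    return {
        category: [r["name"] for r in
                   sorted(grouped[category], key=total, reverse=True)]
        for category in CATEGORIES
    }


# Hypothetical requests, each with one rating per criterion.
requests = [
    {"name": "Project A", "category": "Secondary",
     "ratings": ["Acceptable"] * 6},
    {"name": "Project B", "category": "Secondary",
     "ratings": ["Outstanding"] * 4 + ["Acceptable"] * 2},
    {"name": "Project C", "category": "Primary",
     "ratings": ["Outstanding"] * 6},
]

ranking = rank_within_categories(requests)
print(ranking["Secondary"])
```

Requests are never compared across categories, which matches the statement that ranking happens within each allocation category.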
Schedule
The critical schedule event in Fiscal Year N (FYN) for the HPC Allocation Process is
the February release of the President’s Budget for the following Fiscal Year (FYN+1).
At that time, the allocations for projects funded in the PresBud can be determined.
The following is a proposed schedule for the Allocation Process in FYN:
Month (FYN): February, June, July, August, September
Milestones:
 PresBud for FYN+1 released
 Allocation Board receives requests for use of R&D HPC resources for FYN+1
 Allocation Board completes its review of the latest (Q2) allocation utilization
report and HPC system configuration and initial review of HPC requests
 Allocation Board issues call for revisions to HPC requests as needed
 SEE mid-year FYN Execution Review complete
 Allocation Committee receives revised HPC requests
 Allocation Board completes its review and analysis of ongoing and proposed
new R&D HPC projects
 Allocations for FYN+1 complete
HPC Board Allocation Committee Membership