NATIONAL OCEANIC AND ATMOSPHERIC ADMINISTRATION

NOAA High Performance Computing (HPC) Program
Brian Gross
Acting Deputy Director, High Performance Computing and Communications
Acting Project Manager, R&D High Performance Computing System
August 4, 2015
NCEP Production Suite Review

NOAA HPC Agenda
• National Strategic Computing Initiative
• Governance
• Performance
• Funding
• R&D HPCS Overview
• WCOSS Overview
• Schedules
• Big Data Project

National Strategic Computing Initiative
Executive Order signed July 29, 2015, for a multi-agency strategic vision and Federal investment strategy in high-performance computing

Objectives:
– Accelerate delivery of a capable exascale computing system
– Connect computing used for modeling and simulation to data analytic computing
– Establish, over the next 15 years, a viable path forward for future HPC systems post-Moore's Law
– Increase the capacity and capability of an enduring national HPC ecosystem
– Develop public-private collaboration to share benefits between the US Gov’t and industrial and academic sectors

Roles and Responsibilities:
– Lead Agencies: DOE, DOD, NSF
– Foundational R&D Agencies: IARPA, NIST
– Deployment Agencies: NASA, NOAA, NIH, FBI, DHS
  • These will develop mission-based HPC requirements to influence the early stages of the design of new HPC systems and will seek viewpoints from the private sector and academia on target HPC requirements

https://www.whitehouse.gov/blog/2015/07/29/advancing-us-leadership-highperformance-computing

NOAA HPC Governance Structure

Strategic Execution and Evaluation (SEE) Process: Provides Prioritized and Funded Requirements

HPC Board: Strategic Management
Members: LO DAAs
Chaired by NOAA CIO
• Oversee Performance and Management of NOAA HPC
• Integrate Execution of Prioritized and Funded Requirements Using HPC
• Provide criteria and guidance for establishing allocations

Allocation Committee: HPC Resource Management
Members: NOAA Lab/Center Directors (HPC relevant), NOAA Program Managers (HPC relevant)
Chaired by Lab/Center Director (Rotating)
• Resource Technical Estimating
• Allocation Planning
• Allocation Execution
• Monitor & Evaluate Allocations

Technical Committee (Current Integrated Management Team): Architecture/Acquisition Management
Members: HPCC Office, NOAA HPC Site Leads, ITSSO
Chaired by HPCC Office Director
• Acquisition Execution
• Selection Process
• Lifecycle Management
• IT Security

NOAA Administrative Order (NAO) 216-110: Management and Governance of High Performance Computing

NOAA HPC Overview
Chart includes FY16 PB increase profile for R&D HPCS

R&D HPCS Overview
Development HPC: Systems Integration Contract (CSC)
Research HPC: Interagency Agreement (DOE/ORNL)

Systems Configuration
• Zeus - Fairmont, WV (GSA Leased Space)
  – Short-term/seasonal/inter-annual predictions
  – 383 teraflops - SGI
• Theia - Fairmont, WV (Zeus Replacement)
  – Short-term/seasonal/inter-annual predictions
  – 1,024 teraflops - Cray
• Jet - Boulder, CO (NOAA Skaggs Facility)
  – Hurricane forecast improvement
  – 421 teraflops - Aspen & Cray
• Princeton, NJ (NOAA/GFDL)
  – Climate post-processing and analysis
  – 106 nodes (8 core Intel Xeon) - Dell
• Gaea - Oak Ridge, TN (Oak Ridge National Lab)
  – Climate change research and projections
  – 1,100 teraflops - Cray
• Titan - Oak Ridge, TN (Oak Ridge National Lab)
  – Applications for next generation architectures
  – 500 teraflops (2.6M node-hours) allocation of 27,000 teraflops Cray using Nvidia Graphics Processing Units

Performance Measures: Systems Integration Contract (CSC)
• May 2010-May 2019 / $317M / IDIQ
• 9 yrs with 4-yr base, 4-yr option, 1-yr transition
• Minimum 96.0% System Availability
• Minimum 99.0% Data Availability

Performance Measures: Interagency Agreement (DOE/ORNL)
• Aug 2009-Aug 2016 / $108M / Cost Reimbursable
• 5 year agreement extended 2 years
• New IA signed June 18, 2015, through FY2020
• Minimum 96.0% System Availability
• Minimum 99.0% Data Availability

WCOSS Overview

Contract Award
• Awarded to IBM on November 23, 2011
• ID/IQ w/ firm fixed price task orders
• Contract Value of $502 million
• Period of Performance
  – 5 year base period (Nov 2011 - Nov 2016)
  – 3 year option period
  – 2 year option period for transition

Task Orders
• Task Order 01
  – Initial project management task
• Task Order 02
  – Phase I Base system 170 TF (2012)
  – Phase II Midlife Upgrade 600 TF (2015)
• Task Order 03
  – Phase I enhancement 60 TF (2012)
• Task Order 04: Cray XC-40
  – 2,060 teraflops per site

Facility Locations
• Primary: Reston, VA (IBM provided facility)
• Backup: Orlando, FL (IBM provided facility)

System Configuration
Identical Systems (per site)
– IBM iDataPlex (Sandy Bridge)/NextScale (Ivy Bridge)
– 830 teraflops per site

Performance Requirements
– Minimum 99.9% Operational Use Time
– Minimum 99.0% On-time Product Generation
– Minimum 99.0% Development Use Time
– Minimum 99.0% System Availability
– Failover tested regularly

R&D HPCS Schedule
(Timeline chart, FY2015 through FY2020; bar labels listed by system)

Gaea
– Current System
– Planned Recapitalization
– Maintenance and Enhancement of Capability (FY16 Request - $9M/year)
– Maintenance and Enhancement of Capability (FY16-19 Request - Ramp Up Profile)
– Recapitalization
– DOE IAA / Sustain Operations
– DOE IAA / Five-year Agreement

Zeus/Theia
– Current System (Zeus)
– Sandy Supplemental (Phase 1 - Theia)
– Sandy Supplemental (Phase 2 - Fine-grained)
– Recapitalization (FY16-19 Request - Ramp Up Profile)
– Maintenance and Enhancement of Capability (FY16-19 Request - Ramp Up Profile)

Jet
– Current System
– Annual Enhancements (three successive bars)
– CSC Option
– CSC 1 year Option
– R&D HPCS Integrator follow-on Acquisition

Chart key: System Availability; Transition; Contract / IAA

WCOSS Schedule
(Timeline chart, fiscal years 2014 through 2021, by quarter)

• IBM IDIQ Contract base period
  – Task Order 002
    Phase I: Phase I system is retired
    Phase II: Phase II system is retired
  – Task Order 003: Task Order 003 system is retired
  – Task Order 004: TO 4 Award, TO 4 Acceptance; NCEP can decide how long it wishes to keep TO4 System
  – Task Order 005 (Power): TO 5 Award
• 3 Year IDIQ Option Period
  – Task Order 006 (Phase III)
• 2 Year IDIQ Option Transition Period
  – Task Order 007 (Phase IV)
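The availability minimums quoted on the R&D HPCS and WCOSS overview slides can be read as a downtime budget. The sketch below is illustrative only: it assumes the percentages are measured over a 365-day year, which the slides do not state, and the function name `max_downtime_hours` is hypothetical rather than anything taken from the contracts.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 8,760-hour measurement period, leap years ignored


def max_downtime_hours(min_availability_pct: float,
                       period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the most downtime a period can contain while still meeting the minimum."""
    if not 0.0 <= min_availability_pct <= 100.0:
        raise ValueError("availability must be a percentage between 0 and 100")
    return period_hours * (100.0 - min_availability_pct) / 100.0


# Minimums as quoted on the overview slides
requirements = {
    "R&D HPCS system availability": 96.0,
    "WCOSS operational use time": 99.9,
    "WCOSS system availability": 99.0,
}

for name, pct in requirements.items():
    hours = max_downtime_hours(pct)
    print(f"{name}: {pct}% minimum -> up to {hours:.1f} hours of downtime per year")
```

Under that yearly assumption, a 96.0% minimum tolerates roughly 350 hours of downtime, while the 99.9% operational minimum tolerates under 9 hours, which illustrates how much tighter the operational requirement is than the research one.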
NOAA’s Big Data Project

The Idea
• Meet DOC Goal to transform the Department’s data capacity to enhance the value, accessibility and usability of Commerce data for government, business and the public
• Unleash full potential of NOAA data through innovative approaches
• Enable private sector to develop new information products and lines of business
• Improve compliance with Open Data policy (OMB M-13-13)

The Approach
• Position NOAA’s data alongside computing and analysis capabilities
• Create self-sustainable market ecosystem where industry:
  – Moves NOAA data to cloud at no net cost to government
  – Provides public access to original NOAA data
  – Creates potential for new profitable services
• Anchor partners established in April 2015 as nucleus around which data marketplaces (Data Alliances) can form (https://data-alliance.noaa.gov/)

Results to date
• Over 135TB of NEXRAD Level II data moved from National Centers for Environmental Information (NCEI) archive in Asheville, NC to collaborators

Backup Slides

NOAA HPC Organization

• Zachary Goldstein: NOAA CIO and Director, High Performance Computing and Communications
• Mitchell Ross: Director, NOAA Acquisition and Grants Office
• Bill Lapenta: Director, NCEP
• Ben Kyger: Director, NCEP Central Operations
• NCEP Central Operations
  – Production Management Branch
  – Shared Infrastructure Services Branch
  – Systems Integration Branch
• Brian Gross: Acting Deputy Director, High Performance Computing and Communications
• Kelly Mabe: Director, Strategic Sourcing Acquisition Division
• Weather and Climate Operational Supercomputing System
  – Mike Kane, FAC-P/PM Level III
• R&D High Performance Computing System
  – Brian Gross (Acting)
• HPCC Acquisition Support
  – Michael Blumenfeld
  – Contract Specialists
  – FAC-C Level III
• R&D Integrated Management Team
  – Mike Kane, FAC-COR Level III
  – Bernie Siebers, FAC-COR Level III
  – Rene Rodriguez, ISSO
  – Jeff Flick, ISSO