The NCI National Environmental Research Data Interoperability Platform to support High Performance access to Oceans- and Marine-related Interdisciplinary Research

Ben Evans1, Tim Pugh2, Edward King3,4, Jonathan Hodge3, Lesley Wyborn1

1 National Computational Infrastructure (NCI), Australian National University, Canberra, Australia
2 Bureau of Meteorology, Melbourne, Australia
3 CSIRO Marine and Atmospheric Research, Australia
4 Integrated Marine Observing System, Hobart, Tasmania

As data volumes from ocean observation and modelling activities grow exponentially, accessing data and analysing long-term data archives become increasingly challenging. But the oceans community is not alone: all members of the Earth Systems and Environmental communities face the same challenge. We therefore need a solution that not only enables the oceans community to manage its own data assets, but also facilitates seamless integration of these data sets with data from other communities (e.g., climate, atmospheric, near shore terrestrial and biological) to empower the next generation of high resolution, Data-Intensive interdisciplinary research.

To progress towards this goal, the National Computational Infrastructure (NCI) at the Australian National University (ANU) has organised a priority set of large volume national environmental and earth systems science data assets on a High Performance Data (HPD) Node within a High Performance Computing (HPC) facility. The node was developed under the Research Data Storage Infrastructure (RDSI) program, which is a component of the Australian Government’s National Collaborative Research Infrastructure Strategy. The colocation of these large volume collections with a high performance and flexible computational infrastructure is designed to support the emerging area of Data-Intensive Science, whereby HPC analytics can be undertaken directly across all of the data content for interdisciplinary analysis.
To achieve this, formats need to be self-describing and all attributes need to conform to international standards for vocabularies and ontologies. High Performance access to data is facilitated through direct access on NCI’s supercomputer (Raijin) and cloud (Tenjin), as well as through OPeNDAP, OGC and other services, and fast, programmatically searchable catalogues.

The initial ingestion at NCI comprises 31 data collections (and growing), requiring over 10 Petabytes (PBytes) of storage. They are currently categorised into six major fields, all related to the environmental sciences: 1) Earth system sciences, climate and weather model data assets and products; 2) Earth and marine observations and products; 3) Geosciences; 4) Terrestrial ecosystem; 5) Water management and hydrology; and 6) Astronomy, social science and biosciences.

Properly architected, the National Environmental Research Data Interoperability Platform will lead to: 1) a dramatic improvement in the scale, resolution, reach and integration of Australian Oceans and Marine research; 2) seamless high performance access to nationally significant data collections using new Data-Intensive capabilities to support cutting-edge research methods; and 3) the realisation of synergies with related international research infrastructure programs, particularly those of the Oceans and Marine research domains.
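
The requirement stated above, that formats be self-describing and that attributes conform to standard vocabularies, can be illustrated with a minimal sketch. It is not NCI's implementation. The required attribute names (`standard_name`, `units`), the small allowed-name set, and the functions `check_variable` and `check_dataset` are all assumptions introduced here for illustration; a real platform would presumably validate against a full published vocabulary rather than a hard-coded set.

```python
# Hypothetical, deliberately tiny vocabulary. A real check would load a
# complete published controlled vocabulary instead of this hard-coded set.
ALLOWED_STANDARD_NAMES = {
    "sea_surface_temperature",
    "sea_water_salinity",
    "air_temperature",
}

# Assumed minimum set of attributes each variable must carry to count as
# self-describing in this sketch.
REQUIRED_ATTRIBUTES = ("standard_name", "units")


def check_variable(name, attributes):
    """Return a list of problems found in one variable's metadata."""
    problems = []
    for key in REQUIRED_ATTRIBUTES:
        if key not in attributes:
            problems.append(f"{name}: missing attribute '{key}'")
    standard_name = attributes.get("standard_name")
    if standard_name is not None and standard_name not in ALLOWED_STANDARD_NAMES:
        problems.append(f"{name}: unknown standard_name '{standard_name}'")
    return problems


def check_dataset(variables):
    """Check every variable in a {name: attributes} mapping."""
    problems = []
    for name, attributes in variables.items():
        problems.extend(check_variable(name, attributes))
    return problems


# Example metadata, as it might be read from a self-describing file header.
example = {
    "sst": {"standard_name": "sea_surface_temperature", "units": "K"},
    "sal": {"standard_name": "saltiness"},  # no units, non-standard name
}

for problem in check_dataset(example):
    print(problem)
```

Running a check of this kind at ingestion time is one way a collection could be held to a common vocabulary before it is exposed through catalogues and data services; the variable `sst` passes, while `sal` is reported for a missing attribute and an unrecognised name.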