Earth System CoG and the Earth System Grid Federation: A Partnership for Improved Data Management and Project Coordination

BESSIG, March 18, 2014, Boulder, CO

Sylvia Murphy (NOAA/CIRES) (sylvia.murphy@noaa.gov), Luca Cinquini (JPL/NOAA), Cecelia DeLuca (NOAA/CIRES), Allyn Treshansky (NOAA/CIRES)

Presentation Outline
• Overview of ESGF
• ESGF Usage
• Overview of CoG
• CoG Capabilities
• Future Work

Overview of ESGF
• The Earth System Grid Federation (ESGF) is a multi-agency, international collaboration of people and institutions working together to build an open source software infrastructure for the management and analysis of Earth science data on a global scale.
• ESGF is a system of distributed and federated Nodes that interact dynamically through a Peer-To-Peer (P2P) paradigm.
• A client (browser or program) can start from any Node in the federation and discover, download, and analyze data from multiple locations as if they were stored in a single central archive.

ESGF Usage
• CMIP5 (Phase 5 of the Coupled Model Intercomparison Project): possibly the largest coordinated scientific modeling effort of all time.
  » 40+ models, 25+ modeling centers, 17 countries
  » global, distributed archive comprising 2.5 PB of data
• obs4MIPs: NASA and DoE observations packaged as CMIP5 output for easy comparison
• ana4MIPs: reanalysis data
• CORDEX: regional climate models, 2 PB of data
• TAMIP: atmospheric model intercomparison
• GeoMIP: geoengineering model intercomparison
• DCMIP: dynamical core model intercomparison

Overview of CoG
• CoG is a collaboration environment and hub to connect projects in the Earth sciences.
• It hosts software development projects, model intercomparison projects (MIPs), and university short courses or workshops.
• It includes a configurable search of data on ANY ESGF data node.
• It provides projects with a wiki and customizable navigation to wiki content.
• It contains an ontology for the description and management of projects and provides a consolidated look at this content across a project’s network.
• It contains a file server for documents and images.
• It provides services for Earth system model metadata collection and display.

Some of the 74 projects hosted on CoG include:
• ana4MIPs
• obs4MIPs
• National Climate Predictions and Projections Platform (NCPP)
• Climate Informatics (University of Michigan)
• Earth System Documentation (ES-DOC)
• NOAA’s High Impact Weather Prediction Project (HIWPP)
• Earth System Prediction Capability (ESPC)
• Dynamical Core Model Intercomparison Project (DCMIP)

Customizable Data Services…Interfacing with ESGF
• The search widget can be turned on or off.
• Search can be narrowed to any ESGF node and to any project (e.g. CMIP).
• Search facets can be created, deleted, and grouped.
• Help text can be added to the top of the search page.
• Search results can be saved to a Data Cart associated with a user. Items in the Data Cart persist.
• Search results can be:
  – Forwarded to the Live Access Server (LAS) for simple visualization.
  – Downloaded directly via a WGET script.
  – Associated with model metadata if it exists.

ESGF Search Customization

Data Cart
• Items in the Data Cart can be sent individually or collectively to LAS or WGET.
• The Data Cart is associated with a user and not a project.

Show Metadata

Wiki and Collaboration Tools
The CoG layout is color-coded:
• The right-hand side (dark yellow) is where services (data, news, project connectivity) are located.
• The upper navigation bar (dark teal) contains links to project-level metadata.
• On the left (light teal) is an auto-generated navigation system created when projects develop freeform content.
• The central portion of the site is a wiki that allows projects to create their own content.

Screenshot of the CoG project workspace for the 2012 Dynamical Core Model Intercomparison Project (DCMIP) Workshop.
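As an illustration of the search narrowing described under Customizable Data Services (restricting a search to one ESGF node and one project, plus facet values), the sketch below builds a query URL in Python. It is not taken from the CoG code base. It assumes the node exposes a REST-style search endpoint at /esg-search/search that accepts facet=value pairs plus limit and format parameters; the helper name build_search_url and the facet name variable are hypothetical choices made for this example. No network request is made.

```python
from urllib.parse import urlencode


def build_search_url(node_base, project, facets=None, limit=10):
    """Build a search URL for a single ESGF node (hypothetical helper).

    Assumption: the node serves a search endpoint at /esg-search/search
    that accepts facet=value pairs, "limit", and "format" parameters.
    """
    params = [
        ("project", project),                 # narrow the search to one project
        ("limit", str(limit)),                # cap the number of results
        ("format", "application/solr+json"),  # assumed JSON response format
    ]
    # Append facet constraints in a stable (sorted) order.
    for name, value in sorted((facets or {}).items()):
        params.append((name, value))
    return f"{node_base.rstrip('/')}/esg-search/search?{urlencode(params)}"


# Example: narrow to the PCMDI node and the CMIP5 project.
url = build_search_url("http://pcmdi9.llnl.gov", "CMIP5", {"variable": "tas"}, limit=5)
print(url)
```

Keeping the node base URL as a parameter mirrors the point that a search can start from any node in the federation; only the host changes.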
Project Networks and the Project Browser
• Projects in CoG are arranged in a hierarchy of Parents, Peers, and Children.
• The Project Browser displays the network and allows for inter-project navigation.
• Projects can be tagged with keywords and searched for by keyword.

Project-level Metadata Roll-up
• Managing information is a major problem in projects that involve many sub-projects, partners, multiple leads, and many resources.
• CoG acts as an index into the project information needed for coordination and collaboration, and lets the people responsible for overall coordination quickly get consolidated views of that information.

This example shows the Partners feature, which allows projects to list their project partners and include a logo for each. Below the list for ES-DOC is a consolidated view of the partners for ES-DOC’s peer projects.

CoG Schema
Project-level metadata is linked in standardized locations via the upper navigation bar. The CoG schema contains classes to describe software development projects, short courses or meetings, and overall project coordination. Projects select which metadata to display via a simple web form.

UML Diagram: https://earthsystemcog.org/site_media/projects/cog/cog_ontology.png

Resources
• Resources are pointers to data, files, and URLs.
• Resource folders can be created, moved, and deleted.
• Projects can turn on a set of standardized Resource folders (e.g. Presentations, Minutes).
• Data searches can be saved as a Resource.
• Each Resource can have a private wiki-based notes page to facilitate discussions.

News
• News is a way to send announcements across a project network.
• News is visible in the news widget on any targeted project.
• News will be added to social media (Google+, Facebook, Twitter, RSS) in a future release.
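The Parents/Peers/Children network and the Partners roll-up described under Project-level Metadata Roll-up can be pictured with a small data structure. The Python sketch below is only an illustration under stated assumptions: the Project class, its fields, and the rollup_partners function are hypothetical names and do not reflect the actual CoG schema or implementation, and the project and partner names are placeholders.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality; avoids recursing through cyclic peer links
class Project:
    """Hypothetical stand-in for a CoG project and its network links."""
    name: str
    partners: list = field(default_factory=list)
    parents: list = field(default_factory=list)
    peers: list = field(default_factory=list)
    children: list = field(default_factory=list)


def rollup_partners(project, relation="peers"):
    """Return the sorted, de-duplicated partners of the related projects.

    `relation` is one of "parents", "peers", or "children".
    """
    consolidated = set()
    for related in getattr(project, relation):
        consolidated.update(related.partners)
    return sorted(consolidated)


# Placeholder projects and partners, for illustration only.
main = Project("Project A", partners=["Partner 1"])
peer_b = Project("Project B", partners=["Partner 2", "Partner 3"])
peer_c = Project("Project C", partners=["Partner 3"])
main.peers = [peer_b, peer_c]

print(main.partners)          # the project's own partner list
print(rollup_partners(main))  # consolidated view across its peers
```

The roll-up reads only one level of the network (direct peers, parents, or children), which matches the consolidated peer view shown in the Partners example; a deeper traversal would be a separate design decision.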
Model Metadata Services
• The CoG Team is partnering with the international Earth System Documentation (ES-DOC) project to develop and use an Earth System Model metadata entry and view capability.
• The ES-DOC Viewer is a lightweight JavaScript plugin that will display any Common Information Model (CIM) record.
• The ES-Questionnaire collects standardized CIM metadata through a highly customizable web form. The output is saved to a community CIM repository.

Future Work
• CoG – ESGF Integration (through summer 2014):
  – CoG is going to replace the ESGF web front end.
  – CoG will be federated so that projects hosted on one CoG-ESGF instance will be visible on others.
  – OpenID access will be added.
  – Look and feel will be more customizable to meet institutional branding requirements.
• Possible other features:
  – Be able to export the CoG ontology.
  – Be able to list non-hosted projects in the Project Browser.
  – Be able to version the wiki.
  – Enable RSS and social networking for news.
  – Enable non-wiki links in the left navigation bar.

Questions?
Questions and contacts: cog_support@list.woc.noaa.gov
CoG: https://earthsystemcog.org/
ESRL ESGF data node: http://hydra.fsl.noaa.gov/esgf-web-fe/
PCMDI ESGF data node: http://pcmdi9.llnl.gov/esgf-web-fe/
JPL ESGF data node: http://esg-datanode.jpl.nasa.gov/esgf-web-fe/