Research Data Alliance (RDA) Marine Data Harmonization Interest Group (RDA-MarineData)

Case Statement

1. WG Charter

The RDA Marine Data Harmonization Interest Group (RDA-MarineData) is a collaborative effort to support, promote, and facilitate the harmonization of the ongoing work of pre-existing marine data interoperability efforts, while linking their technical activities to emerging RDA WG activities where and when appropriate.

Ocean science is becoming increasingly global in scope, and there has been a paradigm shift in marine research from the traditional discipline-based approach towards multidisciplinary and ecosystem-level research. This approach is necessary to address a number of phenomena, both natural and man-made, whose impacts require researchers from a diverse range of disciplines to work together. A prerequisite for this type of multidisciplinary research is the availability of large volumes of good quality interoperable data which can be easily discovered and accessed.

Several data interoperability efforts are currently underway in a variety of oceanographic/marine research organizations and communities around the world. However, communication and coordination amongst these efforts have proved difficult. Facilitating the re-use of marine data is a priority within this domain, as the need for datasets increases to support the development of this ecosystem-based approach to marine research and the multidisciplinary study of topics such as global climate change.

RDA-MarineData is a subject-specific, science-domain-focused interest group whose members will keep themselves well informed about, and engaged with, other RDA working and interest group approaches that are relevant to marine data and metadata activities within the existing networks of data centres and repositories, and the other efforts, that they represent.
This will provide active marine data use-cases, input and feedback from active researchers/stakeholders from RDA member countries to improve the maturity and robustness of the specifications and recommendations of other RDA Working Groups, while at the same time improving the quality of the science within the marine scientific research domain that depends on marine data interoperability.

RDA-MarineData has the following objectives:

1. Promote the development of improved, more consistent and more efficient implementation of data interoperability features for existing marine and ocean data interoperability initiatives.
2. This, in turn, should result in the availability of a large volume of good quality interoperable data which can be easily discovered and accessed.
3. Engage with the global marine science community to promote data interoperability.
4. Help to improve the quality of deliverables (and associated implementation documentation) generated by RDA Working Groups by providing a community of researchers and technologists familiar with the RDA working group process and prepared to help test/vet solutions as they emerge from those WGs.
5. Facilitate and promote the adoption and use, by the marine research and data management community, of new and emerging technologies, practices and connections developed by the RDA that accelerate and facilitate research data sharing and exchange across disciplines.

2. Value proposition

2.1 Key impacts of the RDA Marine Data interest group

The coordination of existing, global-scale marine data interoperability efforts will underpin a multidisciplinary, ecosystem-level approach to marine research. It will allow users to access best practices for managing a range of marine data types and facilitate wider re-use of existing marine data.
To achieve this objective the RDA-MarineData aims to:

o Promote the development of improved data interoperability characteristics for existing global-scoped marine and ocean data interoperability initiatives.
o Provide RDA WG-based guidance to existing marine data interoperability initiatives to help them improve the discovery of large volumes of good quality interoperable data which can be easily located and accessed.
o Promote common data standards, and tools that enable conformity of data to these standards, as they emerge from RDA WGs and are identified as appropriate to IG member initiatives.
o Engage with the global marine science community through an active and timely communication campaign to promote the outcomes of RDA.

RDA-MarineData activities will have tangible, measurable impacts, including:

o number of active interest/working group members
o number of documents/exemplars downloaded by users
o level of interaction with other RDA interest/working groups, including dissemination and adoption of appropriate RDA WG outcomes/deliverables within the marine data management community
o feedback from the ocean research and data (information; see footnote 1) management community (as well as the number of contacts in the mailing list, if established)

2.2 Individuals, communities and initiatives that will benefit from the RDA Marine Data interest group:

Marine scientists will have improved access to standardised, high quality data, in a usable form, from a greater number of sources, and will be able to use these in combination for the purposes of multidisciplinary marine research. Researchers will also benefit from the application of RDA working group best practices for various aspects of data management at the repositories holding data sets of interest for their research.

Data repository and research infrastructure managers will have access to information, solutions, and a higher level of expertise regarding advanced data interoperability approaches via involvement in existing RDA working groups.
Data scientists will be able to deliver data in standardised formats using agreed common standards. This will facilitate the exchange and re-use of marine data in the long term.

Other RDA working groups will be able to benefit from an existing network of data repositories and marine data interoperability efforts that are already working in similar areas and are willing to provide experience-based feedback on how easily concepts developed within other RDA working groups could be applied to existing marine data facilities. The MarineData IG will also seek to contribute to relevant activities undertaken by other IGs/WGs where there is benefit to the stated objectives of the respective groups.

3. Engagement with existing work in the area

Many of the key national and international marine data management initiatives are represented in the initial membership of the marine data group. These initiatives, described below, are actively working on data/metadata sharing services and interoperability technologies in areas that other RDA working groups are focused on (linked data, community vocabularies, employment of unique identifiers, etc.) and will dedicate staff to review the potential adoption of the RDA working group recommendations.

Marine data oriented initiatives represented in the initial membership of the marine data harmonization group are:

o Rolling Deck to Repository (Bob Arko, USA)
o WHOI Underwater Ocean Imagery Informatics (Andrew Maffei, USA)
o NSF Biological and Chemical Oceanography Data Management Office (Cyndy Chandler, USA)
o Ocean Drilling Program (Doug Fils, USA)
o ODIP (Helen Glaves, Europe)
o IMOS (Roger Proctor, Australia)
o SeaDataNet (Dick Schaap, Europe)
o IODE (Peter Pissierssens, International)
o iMarine (Donatella Castelli, Europe)

Below is a short description of these initiatives and their potential contributions to the RDA-MarineData interest group effort.
Footnote 1: It should be noted that, in addition to the ocean data management community, the marine library community provides a substantial contribution to ocean research and data management through, e.g., data citation, metadata management, publication management, etc.

ODIP

The Ocean Data Interoperability Platform project is a co-funded EU-USA-Australia initiative promoting the development of interoperability between existing regional e-infrastructures to support effective sharing and re-use of marine data across scientific domains and international boundaries. The ODIP partnership includes all the major organisations and initiatives engaged in ocean data management in Europe, the USA, and Australia, and it is also supported by the IOC/IODE. The project is developing prototypes to evaluate and test selected potential common standards and interoperability solutions. ODIP will contribute directly to the RDA efforts in the area of marine data and also promote the outcomes of a number of the relevant RDA working groups.

SeaDataNet (input needed from Dick Schaap)

R2R (input needed from Bob Arko)

IMOS, the Australian Integrated Marine Observing System, is a federally funded research infrastructure program. IMOS is designed to be a fully integrated national array of observing equipment to monitor the open oceans and coastal marine environment around Australia, covering physical, chemical and biological variables. All IMOS data is freely and openly available through the IMOS Ocean Portal for the benefit of Australian marine and climate science as a whole. Marine data and information are the main products of IMOS, and data management is therefore a central element of the project's success. The eMarine Information Infrastructure Facility of IMOS provides a single integrative framework for data and information management that allows discovery of and access to the data by scientists, managers and the public. Wherever possible recognised (e.g.
OGC, ISO) standards are adopted to describe the data and metadata and for web service delivery.

IODE (input needed from Peter)

The programme "International Oceanographic Data and Information Exchange" (IODE) of the "Intergovernmental Oceanographic Commission" (IOC) of UNESCO was established in 1961. Its purpose is to enhance marine research, exploitation and development by facilitating the exchange of oceanographic data and information between participating Member States, and by meeting the needs of users for data and information products. The IODE programme has developed a global network (and associated expert community) of 80 National Oceanographic Data Centres (NODCs) in 78 countries. The IODE network has been able to collect, quality-control, and archive millions of ocean observations, and makes these available to Member States.

The IODE embarked on the development of the “IODE Ocean Data Portal” (ODP) in 2007. It aims to provide seamless access to collections and inventories of marine data from the NODCs of the IODE network, and allows for the discovery, evaluation (through visualization and metadata review) and access to data via web services. The system architecture uses Web-oriented information technologies to access non-homogeneous and geographically distributed marine data and information. The main objective of ODP is to link existing data systems into one global, transparent system. However, where no such national, regional or organizational system exists, ODP can provide the necessary technology and related capacity development support.

iMarine (input needed from Donatella)

BCO-DMO (input from Cyndy added 9-16)

Oceanography is an interdisciplinary field of study that generates and requires access to a wide variety of measurements.
In late 2006 the Biological and Chemical Oceanography Sections of the National Science Foundation (NSF) Geosciences Directorate Division of Ocean Sciences (OCE) funded the Biological and Chemical Oceanography Data Management Office (BCO-DMO). Additional funding was contributed in late 2010 to support management of research data from the NSF Office of Polar Programs (PLR) Antarctic Organisms & Ecosystems Program (ANT). The BCO-DMO is recognized in the 2011 Division of Ocean Sciences Sample and Data Policy as one of several program-specific data offices that support NSF OCE funded researchers in the United States.

Efforts at BCO-DMO focus on comprehensive data management activities that span the full data life cycle from “proposal through preservation”. The essential data management activities include: (1) working with data management professionals to establish a comprehensive data management plan; (2) registering the project in the BCO-DMO catalog; (3) ensuring reliable backup of data and supporting documentation; (4) providing data access systems that support data discovery, access, display, assessment, integration, and export of data resources; (5) submitting final data sets to the appropriate long-term data archive; and (6) formally publishing data sets to provide citable references (Digital Object Identifiers) for publishers of the peer-reviewed literature and to encourage proper citation and attribution of data sets in the future. When combined, these elements comprise the full spectrum of the data life cycle, enabling discovery and accurate re-use and ensuring long-term permanent archive of the data that are an important component of a researcher’s legacy.
BCO-DMO staff members work in partnership with NSF-funded investigators from large national programs and medium-sized collaborative research projects, as well as researchers from single investigator awards, to ensure that data resulting from their respective research projects are archived at the appropriate US National Data Center. In addition to ensuring final archive of NSF OCE funded research data, efforts undertaken by BCO-DMO data managers foster community building, the establishment of trust between collaborative partners, and capacity building through outreach and education efforts. Support is provided at no charge to projects funded by OCE Biology or Chemistry or PLR ANT, and is available to other investigators for a fee.

NSF-funded ocean science researchers in the US have been contributing data from recently funded projects to the BCO-DMO data system, and it has evolved into a rich repository of data from ocean, coastal and Great Lakes research programs. The BCO-DMO data system can accommodate many different types of data, including in situ and experimental biological, chemical, and physical measurements, modeling results, and synthesis data products. The system enables reuse of oceanographic data for new research endeavors, supports synthesis and modeling activities, provides "real data" for classroom use, and provides decision-support field data for policy-relevant activities.

In addition, other more general data initiatives will also be consulted and will contribute to the RDA marine data group. These include: DataONE (USA), COOPEUS (Europe), Australian National Data Service (ANDS), Commonwealth Scientific and Industrial Research Organisation (CSIRO, Australia), EUDAT (Europe), EPOS (Europe) and EMODNET-Geology (Europe).

The RDA Marine Data IG activities will also seek to contribute to the activities and outcomes of the other RDA working and interest groups where these have direct relevance for the management of marine data.
The RDA Marine Data group will also promote the adoption and implementation of the outcomes from the other RDA interest and working groups across the stakeholders in the marine domain. The existing RDA WG and IG groups that the RDA Marine Data Harmonization group has identified as having objectives that are applicable for the management of marine data follow. Our catalog deliverable (see below) would be used to match linkages in existing marine data interoperability efforts to these efforts.

RDA Working Groups
o Data Citation
o Data Foundation and Terminology
o Data Type Registries
o Metadata Standards
o PID Information Types
o Standardisation of Data

RDA Interest Groups
o Big Data Analytics
o Brokering
o Data in Context
o Legal Interoperability
o Metadata Standards Directory
o Preservation e-Infrastructure
o Publishing Data
o Engagement Group

4. Work plan

Our understanding is that an Interest Group need not have specific 18-month duration deliverables unless it intends to shift itself into an RDA Working Group and that Interest Groups can exist indefinitely in this state. For this reason, you will find the deliverables listed below much reduced (until such time as we may decide to become a WG).

a. Milestones and intermediate documentation and deliverables

Note PP: taking into account the objectives of the IG and the expected (tangible) outcomes I would suggest the following:

1. Activity: Call for experts: disseminate information on the objectives of RDA and identify experts from the marine data (information) management communities who wish to participate in activities (either active or observer) in RDA WGs.
   Deliverable: number of experts identified to cooperate in WGs.
2. Activity: development of documents/exemplars to be made available for testing/use by marine data (information) management communities.
   Deliverable: documents/exemplars made available for testing/use by marine data (information) management communities.
3. Activity: reporting back to RDA WGs on experience with exemplars.
   Deliverable: report.
4. Activity: submission of tested/approved technologies, practices and connections, developed by the RDA that accelerate and facilitate research data sharing and exchange across disciplines, to marine data (information) management communities as well as the marine research community for community-wide adoption.
   Deliverable: reports, manuals and guides.

(somewhere in here we should also develop and implement a community-wide communication strategy)

Overall I would say that our biggest priority should be to disseminate information on what is happening within RDA and how our data/information management communities can interact with, contribute to, and benefit from the RDA activities. This means that “someone” will need to take responsibility for actively and regularly communicating with the communities that we wish to reach. Who will do this? We probably need to set up a mailing list to which people can subscribe.

Activity / Deliverable

Activity 1: An initial group of marine data/metadata providers and data interoperability technologists will be invited to join the RDA-MarineData Interest Group.
Deliverable 1: IG membership list showing names of individuals and the marine data related initiatives they represent.

Activity 2: The RDA-MarineData IG members will be asked to provide details about the data interoperability/sharing efforts that they have undertaken in the past or are currently undertaking and to list the deliverables that these efforts have produced along with the technologies (controlled vocabularies, linked data, permanent identifiers, etc.) represented in them.
Deliverable 2: A regularly updated, online catalog of MarineData IG members’ marine data/metadata interoperability efforts and the products they have produced will be created. Note that this might be a spreadsheet or make use of the RDA Community Capability Model WG tools used for a similar purpose.
Activity 3: During plenary meetings, those IG members who attend will review overlapping and gap areas in the marine data/metadata interoperability efforts they are undertaking in the marine/ocean data community.
Deliverable 3: MarineData IG plenary meeting notes will include a section that lists the status of collaborative actions that IG member organizations decide to undertake together to coordinate their data interoperability efforts. This might include tasks like sharing staff to work on implementing new RDA WG recommendations that might be mutually beneficial.

Activity 4: The IG members will identify potentially beneficial linkages between data interoperability efforts in the marine/oceanography community and pertinent efforts within the RDA WGs. This will primarily be done during [...]
Deliverable 4: MarineData IG plenary meeting notes will include a section that reports on the involvement of MarineData IG members in existing RDA WGs since the prior plenary and the technologies they are working on implementing.

b. Final deliverables

No final deliverables are listed in this version of the charter. All deliverables are listed in section a.

c. Mode and frequency of operation

Communication by e-mail as often as needed, with a bi-weekly check-in on progress as a minimum. Bi-monthly virtual meetings via Skype and/or teleconferencing. Use of online collaboration tools (e.g., Google Drive), file-sharing systems, wikis, and other electronic means of asynchronous communication. Additional communication sessions will be scheduled as needed.

d. Group dynamics and management

The RDA-MarineData is modelled upon three principles that ensure that the group makes progress, resolves conflicts, and stays on track and within scope:

Rough consensus - the dominant view prevails; the chairs have the right to identify the dominant view.
Actionable draft - a practically oriented document that describes goals, implementation steps and deadlines is sufficient for the group or individual members to begin their work.
Open review - any member of the group as well as other RDA members may request further discussion or reconsideration of an issue.

e. Broader community engagement and participation

Initially (over the next 6 months) the efforts of the RDA Marine Data interest group will focus on identifying a set of “first-adopter” marine data repositories, researchers and projects that are willing to work with the RDA-MarineData Interest Group in order to examine candidate standards, specifications, etc. for adoption and implementation. This will be followed by a further 6-month period during which a plan to raise awareness and the impact of the RDA Marine Data IG across the wider marine science data community will be implemented. During this period the outcomes of the RDA Marine Data IG and other relevant WGs will be promoted to the user communities.

f. Initial membership

Interest Group Co-chairs:
Helen Glaves (NERC-BGS, UK, hmg@bgs.ac.uk)
Peter Pissierssens (IODE, International)
Roger Proctor (IMOS, Australia, roger.proctor@utas.edu.au)
Donatella Castelli
Enrique Alonso Garcia
Andrew Maffei (Woods Hole Oceanographic Institution, USA)

Initial membership:
Steve Miller (Scripps Institution of Oceanography, USA)
Bob Arko (R2R, USA)

Others who expressed interest in being part of the group once it is established:
Andrew Treloar (ANDS, Australia)
Cyndy Chandler (BCO-DMO, USA)
Jay Pearlman (IEEE, USA)
Dick Schaap (MARIS, Netherlands)
Lesley Wyborn (Geoscience Australia)