EMODnet Chemistry 2
Service Contract MARE/2012/10 S12.656742

Technical developments of the EMODnet Chemistry portal
By Dick M.A. Schaap, Technical Coordinator
Split, Croatia, 19 June 2014, 1st Experts Workshop

EMODnet Chemistry portal

The EMODnet Chemistry portal sits on top of the SeaDataNet infrastructure and uses its services, which have been adapted and are being further developed for specific EMODnet Chemistry needs. It offers users services and functionality for browsing and viewing the chemistry data products, and for identifying and requesting access to the chemistry data sets gathered for European waters. The primary services are:
- CDI Data Discovery and Access Service, for searching and retrieving chemistry data sets;
- OceanBrowser Viewing Service, for viewing, browsing and downloading chemistry data products;
- Sextant Products metadata catalogue, for searching chemistry data products and linking to the viewing service.

CDI service for discovery and unified access of data

CDI Data Discovery and Access Service
- Based upon the ISO 19115 content standard and the ISO 19139 XML Schema; fully INSPIRE compliant
- Dedicated tools and facilities for generating CDI entries, formatting associated data sets and populating the CDI service
- Shopping basket mechanism for discovery, access requests and downloading of data sets from distributed data centres
- Downloading in harmonised SeaDataNet formats: SDN ODV ASCII, and soon SDN NetCDF (CF)
- Adopted in many projects, with ongoing improvements
- Operational governance scheme

Chemistry CDI Data Discovery and Access Service
- Dedicated CDI service for the EMODnet Chemistry scope
- Three search interfaces for human users:
  - Quick Search, with dynamic drilling down of search results;
  - Extended Search, with more flexibility for combining search options, including free search;
  - Variables vs Marine Regions, with an interactive matrix of variables in specific marine regions.
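
To make the idea of a variables-versus-regions matrix concrete, here is a minimal sketch of how such a matrix could be derived from metadata records. It is not the portal's implementation: the record layout (`variable` and `region` keys), the function name `build_matrix` and the sample records are all assumptions made purely for illustration.

```python
from collections import Counter


def build_matrix(records):
    """Count records per (variable, region) pair.

    `records` is a hypothetical list of dicts with 'variable' and
    'region' keys; the real CDI metadata model is much richer.
    Returns a nested dict: matrix[variable][region] -> record count.
    """
    counts = Counter((r["variable"], r["region"]) for r in records)
    variables = sorted({variable for variable, _ in counts})
    regions = sorted({region for _, region in counts})
    # Fill every cell, using 0 where no record matches the combination.
    return {
        variable: {region: counts.get((variable, region), 0) for region in regions}
        for variable in variables
    }


# Illustrative records only; these are not taken from the CDI service.
records = [
    {"variable": "Chlorophyll", "region": "Adriatic Sea"},
    {"variable": "Chlorophyll", "region": "Adriatic Sea"},
    {"variable": "Chlorophyll", "region": "North Sea"},
    {"variable": "Heavy metals", "region": "North Sea"},
]

matrix = build_matrix(records)
for variable, row in matrix.items():
    print(variable, row)
```

Each cell of the resulting matrix can then act as a shortcut to a search restricted to that variable and region.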

Chemistry CDI Service upgrading
- ISO 19115 content standard and XML Schema have been migrated to INSPIRE-compliant ISO 19139
- Extra fields included:
  - EDMED references
  - CSR references
  - Publication references
  - Quality Info references
- Vocabularies upgraded to NVS 2.0
- Sea regions (C19) references via geo-tagging
- User interface also upgraded to support the new elements and extra search options
- Shopping basket extended from 500 to 10,000 CDI requests

Chemistry CDI Service upgrading: search interface
- Extra P02
- Extra Sea Region
- Extra Multiple
- Extra Duration

Chemistry CDI Service: filter
The EMODnet Chemistry 2 project has an extended scope of chemical substances:
- Pesticides and biocides
- Antifoulants
- Pharmaceuticals
- Heavy metals
- Hydrocarbons
- Radionuclides
- Fertilisers
- Organic matter (e.g. from sewers or mariculture)
- Chlorophyll
- Partial pressures of dissolved gases
- Acidity (from pH, pCO2, Total Inorganic Carbon, alkalinity)
- Others
The mapping between the new EMODnet Chemistry 2 groups and the P02 vocabulary is done. This mapping sets the filter that defines the EMODnet Chemistry CDI collection as a virtual subset of the SeaDataNet CDI.

Chemistry CDI Service: present coverage (16 June 2014)
- 661,095 CDI records and data sets
- European waters (N80 W-30; N20 E45): 587,538 CDI records and data sets
- 62 data centres
- 31 countries
- 248 originators
- Years covered: 1868 to 2014
- 82% unrestricted, 18% to be negotiated

Chemistry CDI Service: WMS / WFS services
The CDI locations and related metadata can be shared with other portals by means of OGC WMS (Web Map Service) and WFS (Web Feature Service):
http://geoservice.maris2.nl/wms/seadatanet/cdi_v2/emodnet/chemistry

Further developing controlled vocabularies for EMODnet Chemistry

As part of SeaDataNet, controlled vocabularies (NVS 2.0) are maintained and served by NERC BODC as web services for marking up all metadata and data entries.
At present more than 160 vocabulary lists are served, with more than 150,000 concepts and with established and active governance.

The parameter usage vocabulary P01 is used for data sets, while the parameter discovery vocabulary P02 is used for the CDI metadata. P01 terms are narrower terms of P02. At present P01 already contains more than 30,000 concepts. Therefore activities are undertaken in EMODnet Chemistry for:
- New entries because of the extended scope of substances
- Making mapping to P01 easier and more efficient by exposing the semantic model behind P01 and making it retrievable by components
- Grouping various P01 terms under an aggregated term in a new vocabulary, P35, that will facilitate data aggregation and product labelling

Example of a P01 term:
‘Concentration of tributyltin cation {tributylstannyl TBT+ CAS 36643-28-4} per unit dry weight of biota {Mytilus galloprovincialis (ITIS: 79456: WoRMS 140481) [Subcomponent: flesh]}’

Semantic model: [diagram]

Discovery and harvesting of data sets for regional product groups
- The EMODnet Chemistry 2 project is generating chemistry products per sea region (as defined by the MSFD (draft))
- This requires discovery and harvesting of data sets per MSFD region for specific chemical parameters
- In EMODnet Chemistry 1 this was done manually, using the CDI Discovery and Shopping mechanism
- In EMODnet Chemistry 2 this was done initially in a semi-automatic way, and progress is being made towards an almost fully automatic method: a robot harvester uses the shopping mechanism to discover and retrieve specific data sets from the distributed data centres, in order to compile and maintain specific aggregate data sets as internal central data buffers that can be transferred to the regional groups for further processing and product generation

Discovery and harvesting of data sets for regional product groups: 1st year
- Filter set to discover nutrients data within the (approximate) MSFD regions
- Boundaries of regions approximated by VLIZ and then schematised by MARIS, with extra margins, as GEO-filter
- The robot harvester has gathered circa 440,000 CDI records and data sets for nutrients in the given regions
- These sets were transferred to the EMODnet regional groups per region

Discovery and harvesting of data sets for regional product groups: now almost automatic
- An online Buffer Content Management System (Buffer CMS) has been developed and tested for configuring specific data buffer profiles in agreement (SLAs) with data providers and for specific data user communities (such as EMODnet Chemistry regional product groups, MyOcean, SeaDataNet regional data product groups, …)
- Configuration settings concern the discovery filter, buffer group, motivation and users (by means of SeaDataNet AAA services)
- The robot harvester can be activated to perform retrieval for each buffer profile, and also to maintain the central metadata and data buffers automatically for new entries and updates of existing entries
- Progress of the robot harvesters is administered in the existing online Request Status Manager (RSM) system, which is part of the CDI Shopping mechanism for tracking and tracing requests by users, by data providers and overall

Buffer Content Management System (Buffer CMS)
Screens: logon; overview of buffers; configuring the profile of a specific buffer.

Buffer CMS + Central User Interfacing API
Diagram labels: Robot CMS to configure robot harvesting profiles (MARIS master); CDI User Interface + Shopper, with access regulated via AAA; CDI user interfaces; CDI robot harvester; agreed settings; dynamic maintenance; specific data buffers; RSM system extended with administering robot transactions and via central interfaces.

Central User Interface with logon (AAA service) following authorisation in Buffer CMS profiles
Screens: logon; overview of authorized buffers; central buffer UI incl. direct shopping.

Discovery and harvesting of data sets for regional product groups
- The new Buffer CMS and the Central buffer UI and API (under development for full machine-to-machine interaction), together with the new central shopping mechanism and the upgraded RSM, will greatly facilitate the maintenance of central buffers and the regular delivery of data sets, including metadata, to the Chemistry regional groups
- The central shopping mechanism works on the data buffers and can deliver (in delayed mode) large data sets, which are divided over downloadable zip files with a maximum of 10,000 data sets each; all shopping transactions are administered in a new section of the RSM

REMARK: The central buffers are exclusive to specific applications, and access is secured via the AAA service for authorised users only. These buffers do not replace the distributed CDI infrastructure and its shopping process for regular users.

Request Status Manager (RSM) service extended with administering of Central buffer interfaces
- Logon as user/provider/master
- New functions for central buffer shopping

Converting buffer data sets to validated aggregated data sets

However, the central buffers will contain and deliver ‘raw’ data sets for specific parameters, as harvested from the distributed data centres. Further action is needed to make the collection more homogeneous and validated, resulting in aggregated data sets.

Aggregation and validation for generating homogeneous data collections can be done by using ODV software and specific expertise per region and chemical substance.

Use will also be made of the new P35 vocabulary for aggregating P01 terms.
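
As a rough illustration of what aggregating P01 terms under a P35 term could look like in practice, the sketch below groups observations by an aggregated code. Everything in it is assumed for the sake of the example: the codes are placeholders rather than real P01 or P35 identifiers, the observation layout is invented, and unit harmonisation and quality control (which a real aggregation would need) are left out.

```python
from collections import defaultdict

# Hypothetical mapping from placeholder P01 codes to placeholder P35 codes.
# These are NOT real vocabulary identifiers.
P01_TO_P35 = {
    "P01_EXAMPLE_NITRATE_A": "P35_EXAMPLE_NITRATE",
    "P01_EXAMPLE_NITRATE_B": "P35_EXAMPLE_NITRATE",
    "P01_EXAMPLE_PHOSPHATE_A": "P35_EXAMPLE_PHOSPHATE",
}


def aggregate_by_p35(observations, mapping=P01_TO_P35):
    """Group observations under their aggregated (P35-style) term.

    Each observation is assumed to be a dict with a 'p01' key.
    Returns (grouped, unmapped): a dict of lists keyed by aggregated
    code, and a list of observations whose code has no mapping.
    """
    grouped = defaultdict(list)
    unmapped = []
    for obs in observations:
        aggregated_code = mapping.get(obs["p01"])
        if aggregated_code is None:
            # Keep unmapped observations visible instead of dropping them.
            unmapped.append(obs)
        else:
            grouped[aggregated_code].append(obs)
    return dict(grouped), unmapped


# Illustrative observations only.
observations = [
    {"p01": "P01_EXAMPLE_NITRATE_A", "value": 1.2},
    {"p01": "P01_EXAMPLE_NITRATE_B", "value": 0.9},
    {"p01": "P01_EXAMPLE_PHOSPHATE_A", "value": 0.1},
    {"p01": "P01_EXAMPLE_UNKNOWN", "value": 5.0},
]

grouped, unmapped = aggregate_by_p35(observations)
print({code: len(items) for code, items in grouped.items()})
print(len(unmapped), "observation(s) without a mapping")
```

Returning the unmapped observations separately is a deliberate choice in this sketch: it makes gaps in the mapping easy to spot while the vocabulary is still being populated.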
The P35 population is making progress: http://seadatanet.maris2.nl/v_bodc_vocab_v2/welcome.asp

The final buffers of aggregated and validated data sets will provide the input for data products and advanced visualisation services.

EMODnet Chemistry extension
Diagram labels: CDI Robot Harvester; specific data buffers; regional experts; ODV QC + aggregation; QC'd data buffers; DIVA + OceanBrowser service; Oceanotron & OpenEarth services.

Advanced services
The validated and aggregated buffers of data sets will be input for the products and advanced services:
- Interpolated maps as produced with DIVA software (Ulg)
- Time series graphics for selected stations via WPS (Deltares)
- Data distribution in time plots via WPS (Deltares)
- OceanBrowser viewer (Ulg) as common service for viewing the DIVA maps and giving access to the data distribution and station time series graphics

OceanBrowser Viewing Service (Ulg)
- The viewer provides access to the DIVA interpolated maps
- Output images are available as horizontal sections and vertical sections; the latter can be selected by drawing an appropriate transect
- Inclusion of predefined coastal sections

Data distribution plots in time and station time series via WPS (Deltares)
- To be integrated in the OceanBrowser service

Sextant Products metadata catalogue (IFREMER)
- Used to describe the Chemistry data products, such as DIVA maps. This facilitates searching for specific data products, and the exchange and use of the Chemistry data products in other services, such as the Chemistry OceanBrowser and other portals with OGC WMS support
- Metadata format: ISO 19115 / ISO 19139 with SDN controlled vocabularies; CSW service based upon GeoNetwork

Exploring cloud hosting and computing
- Exploring options for hosting the central aggregated buffer data sets and applications in a cloud hosting and computing environment
- Cloud hosting as a neutral and high-performing environment
- Dialogue with Cineca about options and a possible way forward
Cineca is a non-profit consortium made up of 69 Italian universities and 3 institutions, including OGS and CNR. It is the largest Italian computing centre. Cineca is also a partner in EUDAT, an FP7 project towards a pan-European Collaborative Data Infrastructure that will allow researchers to share data within and between communities and enable them to carry out their research effectively.

www.emodnet-chemistry.eu