Environmental Information Data Centre: enabling the discovery of CEH-held data
John Watkins
Deputy Director EIDC

CEH & NERC in a UK Government setting

CEH monitoring and data collection
• As diverse as our science
• Micro- to macro-scale
• Many sources:
  • Monitoring campaigns
  • 180+ field sites
  • State-of-the-art facilities
  • Regulator networks
  • Volunteers
  • Model outputs
• Long-term and unique
[Image labels: 10 µm scale bar; River Lambourn, Boxford]

CEH data coordination – in partnership
[Diagram labels: Land Cover Map; NRFA; EIDC Data Hub; Users; Web Access; UK Gov Catalogue; NERC Catalogue; CEH Information Gateway; View & download (data access); Metadata catalogue (data discovery); Network; Linked data and integration; Long-term Storage and Curation; EIDC Data Hub; National River Flow Archive; Environmental Change; Query & visualisation tools; Data Transfer Process; NERC Environmental Bioinformatics Centre; Biological Records Centre; Other Data; CEH data; NERC Designated Data Centre; Data]

gateway.ceh.ac.uk
• Links to NERC Data Catalogue Service
• Links to UK Government Portal
• Links to European INSPIRE Portal

Data citation via the Data Hub
“...the data have been allocated a digital object identifier (http://dx.doi.org/10.5285/1a91c7d1-ec44-4858-9af2-98d80f169bbd).”

Harmonising data definitions
CEH Analytical Services Thesaurus (CAST)
No specified vocabulary!

Making definitions open access
CEH Analytical Services Thesaurus (CAST)
• Created to Simple Knowledge Organization System (SKOS) W3C standard
• Designed to describe whole process
• Top concepts:
  • determinands
  • machine descriptions
  • measurement units
  • methods
  • filtration
  • preservation

Importing definitions through Web links

Importing information through Web links

Resource oriented discovery
CEH Analytical Services Thesaurus (CAST)
• SKOS allows links to externally hosted vocabularies, e.g. ChEBI
• Adds further value to datasets tagged using CAST, as they can be integrated with datasets tagged using concepts from linked vocabularies

Linking ecological concepts

Linking to multilingual definitions

Enabling complex environmental queries

Web as a research data resource

Issues & challenges
• Researchers can ask complex questions across diverse data sources using LOD
• How to incentivise data providers to document & tag data => buy-in (e.g. DOIs)!
• Tools to automate the process, tagging at source/time of creation (e.g. LIMS)
• Automating the creation of semantic information for legacy data using diverse information sources (e.g. text mining of past reports and science papers)

Thank you!
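
Supplementary sketch: the CAST slides state that SKOS links to externally hosted vocabularies (e.g. ChEBI) allow datasets tagged with CAST to be integrated with datasets tagged using the linked vocabulary. The Python below is a minimal illustration of that idea only, not the EIDC implementation. All URIs, dataset names and the helper functions `canonical` and `integrate` are hypothetical, and the mapping stands in for `skos:exactMatch`-style links, followed one hop only.

```python
# Hypothetical concept URIs; real CAST and ChEBI identifiers will differ.
CAST_NITRATE = "http://example.org/cast/nitrate"
CHEBI_NITRATE = "http://example.org/chebi/nitrate"
CAST_PH = "http://example.org/cast/ph"

# Stand-in for skos:exactMatch links from CAST concepts to external concepts.
EXACT_MATCH = {CAST_NITRATE: {CHEBI_NITRATE}}


def canonical(uri, links):
    """Return one representative URI for a concept and its directly linked concepts."""
    group = {uri} | links.get(uri, set())
    for source, targets in links.items():
        if uri in targets:
            group |= {source} | targets
    return min(group)  # any deterministic choice works; min() is arbitrary


def integrate(datasets, links):
    """Group dataset names by the canonical concept they are tagged with."""
    index = {}
    for name, tags in datasets.items():
        for tag in tags:
            index.setdefault(canonical(tag, links), set()).add(name)
    return index


datasets = {
    "river_chemistry": [CAST_NITRATE, CAST_PH],  # tagged using CAST (hypothetical)
    "external_survey": [CHEBI_NITRATE],          # tagged using the linked vocabulary
}
index = integrate(datasets, EXACT_MATCH)
shared = index[canonical(CAST_NITRATE, EXACT_MATCH)]
print(sorted(shared))
```

Both datasets end up under the same concept even though they were tagged from different vocabularies, which is the integration benefit the slide describes. A production system would more likely query the published SKOS data with an RDF library rather than hand-built dictionaries.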