Maintaining Long-term Access to Geospatial Data
Digital Curation Centre (DCC) Workshop

Go-Geo! UK academic geospatial metadata standards and formats

Tony Mathys, Metadata Officer
EDINA National Data Centre
University of Edinburgh
http://edina.ac.uk/

Overview
• EDINA national data centre
• Geospatial metadata
• Go-Geo! resources
• The support role of Go-Geo! resources in maintaining long-term access to geospatial data
• Current EDINA metadata support activities
• Conclusions

EDINA
• A National Data Centre for Tertiary Education since 1995, based at the University of Edinburgh Data Library
• The EDINA mission: to enhance the productivity of research, learning and teaching in UK higher and further education
• Focus is on service, but EDINA also undertakes research and development projects that lead to services
• Major content provider within academia
• Strategic move toward interoperability and a shared services role
• Substantial experience in handling and delivering key spatial data and geo-referenced information

Spatial data and information delivery services
• Digimap
• agcensus
• UKBORDERS

Experiences in geospatial metadata
• Beginning in the 1980s, more than 25 years' experience with geospatial metadata initiatives, policies, projects and services
  - ESRC Computer files cataloguing group (1980s)
  - Register of spatially referenced data for Scotland (1991)
  - “Metadata in the Geosciences” (published 1991)
  - Global Environmental Network for Information Exchange (GENIE) (1990s)
  - Rawa Taio, environmental metadata service (NZ, 1996)
  - MetroGIS, Minneapolis/Saint Paul Metropolitan Organisation for promoting metadata creation and spatial data sharing (1998)
  - “The Minnesota Metadata Mission” in GeoInfoSystems (published 1999)
  - State representative on ANZLIC Metadata Working Group
  - Advisors to AskGiraffe and now hosting GIgateway service
  - UK GEMINI (Geo-spatial Metadata Interoperability Initiative)
  - Creation of Go-Geo! portal and metadata creation resources

So what is METADATA?
The word appears to be of Greek and Latin origin?
Conjures up images of Ancient Greece and Rome, the Mediterranean Sea, sun and holiday
Photographic Images copyright: Jupiter Images 2006

but metadata represents something completely different... and it’s not sun and holiday

Metadata (data describing data) is a documented and ordered summary of information that describes something, in this case a spatial dataset.
The description includes the What, Who, Where, When and Why of a dataset, plus access and use conditions.

Think of metadata as a recipe for making ale
• What are the ingredients?
• What are the brewing steps?
• Who sells the ingredients?
• Where can you buy them?
• When is the fermentation completed?
• Why make ale? Maybe an incentive to encourage metadata creation?

Geospatial metadata standards
• Started with Dublin Core (ISO 15836), which has only 15 elements
• Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM) was introduced in the mid-1990s to document spatial datasets
• ISO 19115 Metadata Standard for Geographic Information was ratified in 2003 and will replace the FGDC standard
• UK GEMINI, ratified in 2004, is the new geospatial metadata standard for the UK.
  - Supersedes the National Geospatial Data Framework (NGDF) and GIgateway Metadata Guidelines
• UK GEMINI was created to be ISO 19115 compliant

The problem is that geospatial metadata standards have too many elements (300+)

Geospatial metadata application profiles
• An application profile is derived from a standard and reduces the number of entities and elements
• It should include the elements best suited to a working group’s specific applications
• A profile can also be extended to meet the requirements of a working group
• A core element set should be considered as a first step towards creating a metadata application profile
• The core element set should be inclusive to ensure interoperability across the wider geospatial community
• Europeans and North Americans are developing their own profiles for ISO 19115

Why metadata?
• two decades of GIS and spatial data capture technology
• an eclectic range of academic disciplines uses GIS as a research and teaching tool
• hence, considerable cost and time invested in spatial data creation
• requires spatial data management and sharing solutions delivered through tools supporting the documentation of datasets

So why bother documenting a spatial dataset?
Can you tell me...
• Where did it originate?
• Where is the study area represented in the data?
• When were the data collected?
• What is its purpose?
  - spatial accuracy?
  - type of application?
  - processes or algorithms used to create it?
• What is the spatial reference system?
• Who can supply the data?
• What do these polygons represent?
• What attribute information does the dataset contain?
• What does this attribute mean?
• What do these SOILCLASS attribute values mean?
These questions can be answered in one metadata record

Go-Geo! metadata resources for UK academia
• geospatial metadata application profile (AGMAP) and supporting guidelines
• metadata creation and editing tool
• a portal for search and discovery of spatial datasets and other GI resources

UK Academic Geospatial Metadata Application Profile (AGMAP)
UK AGMAP
- an application profile created to support the specific needs of the UK academic community
- contains 97 elements categorised and separated under seven entity groups and subgroups
  G1 Citation
  G2 Identification Information ‘What’
  G3 Data Quality Statement
  G4 Dataset Extents ‘Where and When’
  G5 Custodian ‘Who’
  G6 Distributor ‘Access’
  G7 Metadata Creator Information
- UK AGMAP is based on the current geospatial standard

AGMAP includes ISO 19115 and UK GEMINI elements
[Diagram labels: UK GEMINI; ISO 19115; 2003; 53 elements; AGMAP; 2004; 43 elements]
53 + 43 + 1 = 97 elements
* AGMAP originally included elements from the FGDC standard
* Mapped to: Dublin Core, FGDC, Data Documentation Initiative (DDI), e-Government Metadata Standard

AGMAP mandatory elements
- Dataset Name
- Creator
- Edition
- Dataset Code
- Date (L)
- Dataset Event Date
- Dataset Update Frequency (L)
- Dataset Language (L)
- Dataset Topic (L)
- Controlled Vocabulary
- Controlled Keywords (L)
- Description
- Spatial Reference System (L)
- Spatial Reference System used for the bounding rectangle or bounding polygon (L)
- West bounding co-ordinate
- East bounding co-ordinate
- North bounding co-ordinate
- South bounding co-ordinate
- Nations (L)
19. Name of Custodian
20. Postal Street Address of Custodian
21. Postal City of Custodian
22. Postal Code of Custodian
23. Postal Country of Custodian
24. Name of Distributor
25. Postal Street Address of Distributor
26. Postal City of Distributor
27. Postal Code of Distributor
28. Postal Country of Distributor
29. Dataset Format Name
30. Dataset Version Name
31. Name of Metadata Creator
32. Postal Street Address of Metadata Creator
33. Postal City of Metadata Creator
34. Postal Code of Metadata Creator
35. Postal Country of Metadata Creator
36. Metadata Last Updated

Metadata creation resources
Most spatial data information (metadata) is stored in our heads. We need to move it from there to electronic files.

Go-Geo! Metadata Editor tool

UK AGMAP guidelines
Contain material and examples to assist metadata creators in creating quality, UK AGMAP-compliant records

Once created, metadata records can be validated and stored in a personal, secure directory or submitted for publication on the Go-Geo! portal.

Go-Geo! Portal
A simple interface designed for UK academia to run queries to discover spatial datasets.
The portal enables searching with various options, including
- free text
- date
- resource type
- geographic location

[Screenshot: Go-Geo! simple search]
[Screenshot: Go-Geo! advanced search (Data type, Location, Text, Date)]

The portal cross-searches the internet and harvests metadata records from other spatial data portals
[Diagram labels: User; Portal; Other Content Providers; Geo-data Gateway; NGDF/GIgateway Network; Local Go-Geo! database]
and returns metadata records to the user, who can then access and evaluate the information and acquire the data

Go-Geo! role in maintaining long-term access to spatial data
AGMAP profile
establishes and sustains internal harmonisation within academia while maintaining links to the greater GI community
AGMAP guidelines
provide an important reference to ensure quality and maintain format consistency
Metadata creation tool (AGMAP template)
- ensures consistency across all metadata records created
- eliminates or reduces the risk of redundancy in data collection or deletion of existing datasets
- provides a tracking mechanism to monitor changes and edits to datasets
* A data management resource for protecting the longevity of a dataset

Go-Geo! Portal
- a repository to store and manage metadata
- announces datasets and applications to potential users
- exposes datasets to interested parties in academia and other sectors
* A data management and sharing resource for supporting the longevity of a dataset

Steps to data immortality
[Diagram labels: Fit for purpose?; Discover; Locate; Access; Use; Publish; Preserve]

Current EDINA metadata support activities
Go-Geo! Portal, phase 4b
• JISC funded, 18-month project
• ‘Metadata workshops’
  24 workshops have been scheduled since 2003. Training is intended to change mindsets and encourage metadata creation for data management and sharing. Workshops are integral to the longevity of a dataset because they encourage data developers to use Go-Geo! resources to create and publish metadata. This is the first critical step towards protecting and sustaining spatial data, and a foundation for planning long-term access strategies to spatial data.
• Local geospatial metadata management pilot study scheme
  A pilot study with 4 universities to establish a business model for metadata creation and data maintenance based on the use of Go-Geo! resources as local data management tools.
  * 400+ datasets revealed in audits at 4 institutions; 100s more orphan datasets.

[Diagram: University A, supported by AGMAP, Guidelines, Metadata tool, Customised Go-Geo! portal nodes and Training. Go-Geo! nodes: Geography, Archaeology, Geological Sciences, Biological Sciences. Spatial data repository.]

GRADE
• JISC funded project, 18 months
• Looking at the utility of geospatial data repositories for storing and sharing geospatial data
• Comparing thematic v. institutional v. informal
• Compendium of use cases of intended data sharing
• Assess interoperability aspects of geospatial data repositories

Spatial data global service for UK academia
[Diagram: Universities A, B, C and D, each with HFE Application Profile, Guidelines, Metadata tool, Customised Go-Geo! Portal Nodes and Training. Go-Geo! nodes per university: Geography, Archaeology, Geological Sciences, Biological Sciences. Other labels: Spatial Data Repository; Spatial data; Metadata; User; Go-Geo! Search; Other resources and portals.]

Conclusions
No documented data? No data management or data sharing? No long-term data access issue!
Challenges:
• Motivating people to document datasets
  - seen as an onerous task and left undone
  - we were saying this in the ’80s and the situation is no better now
• Difficult to fully automate
  - requires human interpretation
• It’s a people and organisational problem
  - also concerns about IPR, copyright and mechanisms for sharing
• We also need to understand better the life cycles of data and metadata as they are disseminated across the academic community
  - authorship of data and metadata as data are merged, generalised, augmented, new data derived, new editions published
  - tracking and recording digital rights as this happens

Contact details
Tony Mathys
Go-Geo! Project
tony.mathys@ed.ac.uk
Tel.: +44 (0)131 651 1443
Fax: +44 (0)131 650 3308
EDINA web site: http://edina.ac.uk
Go-Geo!: www.gogeo.ac.uk

Questions?
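
Illustration: checking a record against mandatory elements
As an illustration of the AGMAP mandatory elements listed earlier, the sketch below shows how a record might be checked for completeness before it is submitted for publication. It is a minimal sketch under stated assumptions: the dictionary-based record, the function names missing_elements and bounding_box_problems, the example values, and the subset of elements chosen are all hypothetical, and it does not describe how the Go-Geo! Metadata Editor tool actually validates records.

```python
# Illustrative subset of the AGMAP mandatory elements named in the slides.
# Assumption: a record is a plain dict keyed by element name. This is a
# hypothetical format, not the one used by the Go-Geo! tools.
MANDATORY_ELEMENTS = [
    "Dataset Name",
    "Description",
    "West bounding co-ordinate",
    "East bounding co-ordinate",
    "North bounding co-ordinate",
    "South bounding co-ordinate",
    "Name of Custodian",
    "Name of Distributor",
    "Dataset Format Name",
    "Name of Metadata Creator",
    "Metadata Last Updated",
]

BOUNDS = ["West", "East", "North", "South"]


def missing_elements(record):
    """Return the mandatory element names that are absent or blank."""
    missing = []
    for name in MANDATORY_ELEMENTS:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


def bounding_box_problems(record):
    """Return a list of problems found in the bounding co-ordinates.

    Assumption: co-ordinates are decimal degrees and the box does not
    cross the 180 degree meridian, so west must not exceed east.
    """
    try:
        west, east, north, south = (
            float(record[side + " bounding co-ordinate"]) for side in BOUNDS
        )
    except (KeyError, TypeError, ValueError):
        return ["bounding co-ordinates are missing or not numeric"]

    problems = []
    if west > east:
        problems.append("west is greater than east")
    if south > north:
        problems.append("south is greater than north")
    return problems


# Hypothetical record with one blank element and one absent element.
example_record = {
    "Dataset Name": "Example soil survey (hypothetical)",
    "Description": "Soil classes for an example study area.",
    "West bounding co-ordinate": -3.4,
    "East bounding co-ordinate": -3.0,
    "North bounding co-ordinate": 56.0,
    "South bounding co-ordinate": 55.8,
    "Name of Custodian": "Example University",
    "Name of Distributor": "",
    "Dataset Format Name": "Shapefile",
    "Name of Metadata Creator": "A. Researcher",
}

if __name__ == "__main__":
    print("Missing:", missing_elements(example_record))
    print("Bounding box problems:", bounding_box_problems(example_record))
```

A check of this kind only confirms that the elements are present and that the bounding box is plausible. Whether the description is meaningful still requires the human interpretation noted under Challenges.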