Maintaining Long-term Access to Geospatial Data Digital Curation Centre (DCC) Workshop

Go-Geo! UK academic geospatial metadata
standards and formats
Tony Mathys,
Metadata Officer
EDINA National Data Centre
University of Edinburgh
http://edina.ac.uk/
Overview
•
EDINA national data centre
•
Geospatial metadata
•
Go-Geo! resources
•
Go-Geo! resources support role in maintaining
long-term access to geospatial data
•
Current EDINA metadata support activities
•
Conclusions
EDINA
•
A National Data Centre for Tertiary Education since
1995
– based at the University of Edinburgh Data Library
•
The EDINA mission...
to enhance the productivity of research, learning and teaching
in UK higher and further education
•
Focus is on service, but we also undertake research and
development projects that support services
•
Major content provider within academia
•
Strategic move toward interoperability & shared
services role
•
Substantial experience in handling and delivering key
spatial data and geo-referenced information
Spatial data and information delivery services
Digimap
agcensus
UKBORDERS
Experiences in geospatial metadata
•
Beginning in the 1980s, more than 25 years' experience
with geospatial metadata initiatives, policies, projects and
services
– ESRC Computer files cataloguing group (1980s)
– Register of spatially referenced data for Scotland (1991)
– “Metadata in the Geosciences” (published 1991)
– Global Environmental Network for Information Exchange (GENIE) 1990s
– Rawa Taio – environmental metadata service (NZ, 1996)
– MetroGIS, Minneapolis/Saint Paul Metropolitan Organisation for promoting metadata creation and spatial data sharing (1998)
– “The Minnesota Metadata Mission” in GeoInfoSystems (published 1999)
– State representative on ANZLIC Metadata Working Group
– Advisors to AskGiraffe and now hosting GIgateway service
– UK GEMINI (Geo-spatial Metadata Interoperability Initiative)
– Creation of Go-Geo! portal and metadata creation resources
So what is METADATA?
The word appears to be of Greek and Latin origin?
Conjures up images of Ancient Greece and Rome,
the Mediterranean Sea, sun and holiday
Photographic Images copyright: Jupiter Images 2006
but metadata represents something completely different……
and it’s not sun and holiday
Metadata (data describing data)
represents a documented and ordered summary of information that
describes something, in this case, a spatial dataset.
The description includes the What, Who, Where, When, and Why of a
dataset, plus access and use conditions.
Think of metadata as a recipe for making ale
What are the ingredients?
What are the brewing steps?
Who sells the ingredients?
Where can you buy them?
When is the fermentation completed?
Why make ale?
Maybe an incentive to encourage
metadata creation?
Geospatial metadata standards
Started with Dublin Core (ISO 15836), but only 15 elements
•
Federal Geographic Data Committee (FGDC) Content Standard
for Digital Geospatial Metadata (CSDGM) introduced in mid
1990s to document spatial datasets
•
ISO 19115 Metadata Standard for Geographic Information was
ratified in 2003 and will replace FGDC
•
UK GEMINI, ratified in 2004, represents the new geospatial
metadata standard for the UK. Supersedes the National
Geospatial Data Framework (NGDF) and GIgateway Metadata
Guidelines
•
UK GEMINI was created to be ISO 19115 compliant
Problem is that geospatial metadata standards have too many
elements (300+)
Geospatial metadata application profiles
•
An application profile is derived from a standard and represents
a reduction of the number of entities and elements
•
It should include elements that are best suited for a working
group’s specific applications
•
A profile can be extended as well to meet the requirements of a
working group
•
A core element set should be considered as a first step towards
creating a metadata application profile
•
Core element set should be inclusive to ensure interoperability
across the wider geospatial community
•
Europeans and North Americans developing their own profiles for
ISO 19115
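As a rough sketch of the idea above, a profile can be modelled as a core element set, plus a selection from the parent standard, plus any local extensions. The element names below are hypothetical placeholders, not the actual ISO 19115 or UK GEMINI element names.

```python
# Hypothetical element names; a real standard defines several hundred.
STANDARD_ELEMENTS = {
    "title", "abstract", "purpose", "language", "topic_category",
    "bounding_box", "reference_system", "lineage", "distributor",
    "maintenance_frequency", "metadata_date",
}

# Core set: kept by every profile so records stay interoperable.
CORE_ELEMENTS = {"title", "abstract", "bounding_box", "metadata_date"}


def build_profile(selected, extensions=()):
    """Derive a profile: core + selected standard elements + local extensions."""
    unknown = set(selected) - STANDARD_ELEMENTS
    if unknown:
        raise ValueError(f"not in the standard: {sorted(unknown)}")
    return CORE_ELEMENTS | set(selected) | set(extensions)


profile = build_profile(
    selected={"language", "reference_system", "distributor"},
    extensions={"teaching_use_notes"},  # working-group-specific addition
)
print(sorted(profile))
```

Keeping the core set fixed inside the function is one way to make sure no working group can accidentally drop the elements that other communities rely on.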
Why metadata?
•
two decades of GIS and
spatial data capture technology
•
an eclectic range of academic
disciplines uses GIS as a research
and teaching tool
•
hence, considerable cost and
time invested in spatial data
creation
•
requires spatial data management
and sharing solutions delivered
through tools supporting the
documentation of datasets
So why bother documenting a spatial dataset?
Can you tell me …
• Where did it originate?
• Where is the study area represented in the data?
• When were the data collected?
• What is its purpose? What is the spatial reference system?
- spatial accuracy?
- type of application?
- processes or algorithms used to create it?
• Who can supply the data?
• What do these polygons represent?
• What attribute information does the dataset contain?
• What does this attribute mean?
• What do these SOILCLASS attribute values mean?
These questions can be answered in one metadata record
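To illustrate how a single record can answer these questions, here is a minimal sketch that models a record as a Python dataclass. The field names and sample values are invented for illustration; they are not the AGMAP element names and do not describe a real dataset.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """Hypothetical record; each field mirrors one of the questions."""
    title: str              # What is the dataset?
    purpose: str            # Why was it created?
    custodian: str          # Who can supply the data?
    collected: str          # When were the data collected? (ISO 8601 date)
    spatial_reference: str  # What is the spatial reference system?
    bounding_box: tuple     # Where: (west, south, east, north)
    attributes: dict        # What does each attribute mean?


# Invented sample values, for illustration only.
record = MetadataRecord(
    title="Example soil survey polygons",
    purpose="Teaching dataset for a GIS practical",
    custodian="Example University, Department of Geography",
    collected="2005-06-01",
    spatial_reference="British National Grid",
    bounding_box=(-3.4, 55.8, -3.0, 56.0),
    attributes={"SOILCLASS": "Soil classification code for each polygon"},
)


def describe(rec: MetadataRecord, attribute: str) -> str:
    """Look up the documented meaning of an attribute."""
    return rec.attributes.get(attribute, "undocumented")


print(describe(record, "SOILCLASS"))
```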
Go-Geo! metadata resources for UK academia
•
geospatial metadata application profile
(AGMAP) and supporting guidelines
•
metadata creation and editing tool
•
a portal for search and discovery of
spatial datasets and other GI resources
UK Academic Geospatial Metadata Application Profile
AGMAP
UK AGMAP
- an application profile created to support the specific needs of the UK
academic community
- contains 97 elements categorised and separated under seven entity
groups and subgroups
G1 Citation
G2 Identification Information ‘What’
G3 Data Quality Statement
G4 Dataset Extents ‘Where and When’
G5 Custodian ‘Who’
G6 Distributor ‘Access’
G7 Metadata Creator Information
- UK AGMAP is based on current geospatial standards
AGMAP includes ISO 19115 (2003) and UK GEMINI (2004) elements:
53 + 43 + 1 = 97 elements
*AGMAP originally included elements from the FGDC standard
*Mapped to Dublin Core
*FGDC
*Data Documentation Initiative (DDI)
*e-Government Metadata Standard
AGMAP mandatory elements
1. Dataset Name
2. Creator
3. Edition
4. Dataset Code Date (L)
5. Dataset Event Date
6. Dataset Update Frequency (L)
7. Dataset Language (L)
8. Dataset Topic (L)
9. Controlled Vocabulary
10. Controlled Keywords (L)
11. Description
12. Spatial Reference System (L)
13. Spatial Reference System used for the bounding rectangle or bounding polygon (L)
14. West bounding co-ordinate
15. East bounding co-ordinate
16. North bounding co-ordinate
17. South bounding co-ordinate
18. Nations (L)
19. Name of Custodian
20. Postal Street Address of Custodian
21. Postal City of Custodian
22. Postal Code of Custodian
23. Postal Country of Custodian
24. Name of Distributor
25. Postal Street Address of Distributor
26. Postal City of Distributor
27. Postal Code of Distributor
28. Postal Country of Distributor
29. Dataset Format Name
30. Dataset Version Name
31. Name of Metadata Creator
32. Postal Street Address of Metadata Creator
33. Postal City of Metadata Creator
34. Postal Code of Metadata Creator
35. Postal Country of Metadata Creator
36. Metadata Last Updated
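A mandatory-element check could look something like the sketch below. It uses a handful of the element names from the list above; storing a record as a plain dictionary and the sample values are assumptions made for illustration, not a description of how the Go-Geo! tools work.

```python
# A few of the mandatory element names listed above; a full check
# would cover all 36. The dict-based record layout is an assumption.
MANDATORY = [
    "Dataset Name", "Creator", "Description",
    "West bounding co-ordinate", "East bounding co-ordinate",
    "North bounding co-ordinate", "South bounding co-ordinate",
    "Name of Custodian", "Metadata Last Updated",
]


def missing_mandatory(record: dict) -> list:
    """Return the mandatory elements that are absent or blank."""
    missing = []
    for name in MANDATORY:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Invented draft record with one blank and one absent element.
draft = {
    "Dataset Name": "Example land cover map",
    "Creator": "A. Researcher",
    "Description": "   ",
    "West bounding co-ordinate": -3.4,
    "East bounding co-ordinate": -3.0,
    "North bounding co-ordinate": 56.0,
    "South bounding co-ordinate": 55.8,
    "Name of Custodian": "Example University",
}

print(missing_mandatory(draft))
```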
Metadata creation resources
Most spatial data information (metadata)
is stored in our heads.
We need to move it from there to
electronic files.
Go-Geo! Metadata Editor tool
UK AGMAP guidelines
Contain material and examples to assist metadata creators in
creating quality UK AGMAP compliant records
Once created, metadata records can be validated and stored
in a personal, secure directory or submitted for publication
on the Go-Geo! portal.
Go-Geo! Portal
a simple interface designed
for UK academia to run
queries to discover spatial
datasets. The portal
enables searching by the
use of various options
including
-free text
-date
-resource type
-geographic location
Go-Geo! simple search
Go-Geo! advanced search
Search options: data type, location, text, date
Go-Geo! cross-searches the internet and harvests metadata records from other
spatial data portals
(Diagram: User, Portal, Local Go-Geo! database, NGDF/GIgateway Network, Geo-data Gateway, Other Content Providers)
and returns metadata records to the user
to access, evaluate the information and acquire the data
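The search options (text, date, resource type, location) can be sketched as a set of optional filters over metadata records. This is only an illustration under stated assumptions: the records, field names and `search` function are hypothetical and do not reflect the portal's actual implementation, and the bounding-box test ignores boxes that cross the antimeridian.

```python
from datetime import date

# Hypothetical local records; bbox is (west, south, east, north) in degrees.
RECORDS = [
    {"title": "Edinburgh soil survey", "type": "dataset",
     "date": date(2004, 5, 1), "bbox": (-3.4, 55.8, -3.0, 56.0)},
    {"title": "Minnesota land use", "type": "dataset",
     "date": date(1998, 3, 15), "bbox": (-97.2, 43.5, -89.5, 49.4)},
    {"title": "Soil mapping tutorial", "type": "learning resource",
     "date": date(2005, 9, 1), "bbox": (-8.0, 49.9, 1.8, 60.9)},
]


def overlaps(a, b):
    """True when two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records, text=None, since=None, resource_type=None, bbox=None):
    """Apply each supplied criterion; omitted criteria match everything."""
    hits = []
    for rec in records:
        if text and text.lower() not in rec["title"].lower():
            continue
        if since and rec["date"] < since:
            continue
        if resource_type and rec["type"] != resource_type:
            continue
        if bbox and not overlaps(rec["bbox"], bbox):
            continue
        hits.append(rec["title"])
    return hits


print(search(RECORDS, text="soil", resource_type="dataset"))
```

A cross-search would run the same criteria against each remote catalogue and merge the returned records with the local hits.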
Go-Geo! role in maintaining long-term access to spatial data
• AGMAP profile establishes and sustains internal harmonisation within Academia while maintaining links to the greater GI community
• AGMAP guidelines provide an important reference to ensure quality and maintain format consistency
• Metadata creation tool (AGMAP template)
- ensures consistency across all metadata records created
- eliminates or reduces the risk of redundancy in data collection or deletion of existing datasets
- a tracking mechanism to monitor changes and edits to datasets
*A data management resource for protecting the longevity of a dataset
• Go-Geo! Portal
- a repository to store and manage metadata
- announce datasets and applications to potential users
- exposes datasets to interested parties in academia and other sectors
*A data management and sharing resource for supporting the longevity of a dataset
Steps to data immortality
Fit for
purpose?
Discover
Locate
Access
Use
Publish
Preserve
Current EDINA metadata support activities
Go-Geo! Portal – phase 4b
• JISC funded, 18-month project
• ‘Metadata workshops’
24 workshops have been scheduled since 2003. Training is
intended to change mindsets and encourage metadata
creation for data management and sharing.
Workshops are integral to the longevity of a dataset as they
encourage data developers to use Go-Geo! resources to create
and publish metadata. This is the first critical step towards
protecting and sustaining spatial data.
A foundation for planning long-term access strategies to
spatial data.
• Local geospatial metadata management pilot study scheme
A pilot study with 4 universities to establish a business model
for metadata creation and data maintenance based on the use
of Go-Geo! resources as local data management tools.
*400+ datasets revealed in audits at 4 institutions, with
100s more orphan datasets.
(Diagram: University A, with AGMAP, guidelines, metadata tool, customised Go-Geo! portal nodes and training; departmental Go-Geo! nodes for Geography, Archaeology, Geological Sciences and Biological Sciences; spatial data repository)
GRADE
• JISC funded project, 18 months
• Looking at utility of geospatial data repositories for storing and sharing of geospatial data
• Comparing thematic v. institutional v. informal
• Compendium of use cases of intended data sharing
• Assess interoperability aspects of geospatial data repositories
Spatial data global service for UK academia
(Diagram: Universities A, B, C and D, each with the HFE Application Profile, guidelines, metadata tool, customised Go-Geo! portal nodes and training, and departmental Go-Geo! nodes such as Geography, Archaeology, Geological Sciences and Biological Sciences. Also shown: Spatial Data Repository, spatial data, metadata, other resources and portals, and the user's Go-Geo! search.)
Conclusions
No documented data?
No data management or data sharing?
No long-term data access issue!
Challenges:
•
Motivating people to document datasets
– seen as onerous task and left undone
– we were saying this in ’80s and situation no better now
•
Difficult to fully automate – requires human interpretation
•
It’s a people and organisational problem
– also concerns about IPR, copyright and mechanisms for sharing
• We also need to understand better the life cycles of data and
metadata as they are disseminated across the academic
community
- authorship of data and metadata as data are merged, generalised,
augmented, new data derived, new editions published
- tracking and recording digital rights as this happens
Contact details
Tony Mathys
Go-Geo! Project
tony.mathys@ed.ac.uk
Tel.: +44 (0)131 651 1443
Fax: +44 (0)131 650 3308
EDINA web site: http://edina.ac.uk
Go-Geo!: www.gogeo.ac.uk
Questions?