Lesson 7: Metadata

What is Metadata

• Definition of metadata
• Examine information included in a metadata record
• Examples of metadata standards and how to choose
• Illustrate the value of metadata to data users, data providers, and organizations
• Describe the utility of metadata for a variety of scenarios

After completing this lesson, the participant will be able to:
• Define science metadata
• Give examples of metadata that you are likely to encounter in the ‘real world’ (i.e., outside of a research context)
• Identify and list the types of information typically included in metadata records for environmental datasets
• Identify 3 reasons metadata is of value to data users, data developers, and organizations
• List 3 uses for metadata, beyond discovery of data
• Identify and describe factors that may determine which metadata standards are most appropriate for a given dataset

[Figure: data life cycle: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze]

Average Temperature of Observation for Each Species

Species                    Average      Temperature         Number of     Minimum      Maximum
                           Temperature  Standard Deviation  Observations  Temperature  Temperature
Northern Red-legged Frog   4.4          ---                 1             4.4          4.4
Tailed Frog                7.0          3.0                 3             4            10
Arizona Toad               10.0         ---                 1             10           10
Strecker's Chorus Frog     10.5         2.0                 11            9            16
Oregon Spotted Frog        11.0         15.5                2             0            22
New Jersey Chorus Frog     11.5         4.5                 17            3            22
Wood Frog                  12.5         5.5                 897           0            28.8
Spring Peeper              13.2         5.6                 569           -1           32
Red-legged Frog            13.3         5.9                 16            4            27

• Dataset definition: A collection of data
• Generally datasets can be defined as:
  o Spatial – a collection of logically related features arranged in a prescribed manner such as
    GIS map layers, water features, etc.
  o Tabular – a file, spreadsheet, data in a table
  o Many tabular datasets are inherently “spatial”, e.g. water-quality samples associated with stream collection sites
• Elements in a dataset can include:
  o Values, measures, points, coordinates, conditions, qualities, frequencies, or attributes that are a result of an observational study

• When you provide data to someone else, what types of information would you want to include with the data?
• When you receive a dataset from an external source, what types of details do you want to know about the data?

• Providing data:
  o Why were the data created?
  o What limitations, if any, do the data have?
  o What do the data mean?
  o How should the data be cited if they are re-used in a new study?
• Receiving data:
  o What are the data gaps?
  o What processes were used for creating the data?
  o Are there any fees associated with the data?
  o In what scale were the data created?
  o What do the values in the tables mean?
  o What software do I need in order to read the data?
  o What projection are the data in?
  o Can I give these data to someone else?

• WHO created the data?
• WHAT is the content of the data?
• WHEN were the data created?
• WHERE is it geographically?
• HOW were the data developed?
• WHY were the data developed?

Metadata is: Data ‘reporting’

Author(s)       Boullosa, Carmen.
Title(s)        They're cows, we're pigs / by Carmen Boullosa
Place           New York : Grove Press, 1997.
Physical Descr  viii, 180 p ; 22 cm.
Subject(s)      Pirates Caribbean Area Fiction.
Format          Fiction

• Metadata is all around…

• A Standard provides a structure to describe data with:
  o Common terms to allow consistency between records
  o Common definitions for easier interpretation
  o Common language for ease of communication
  o Common structure to quickly locate information
• In search and retrieval, standards provide:
  o Documentation structure in a reliable and predictable format for computer interpretation
  o A uniform summary description of the dataset

Metadata helps…
• Data users
• Organizations

Even if the value of data documentation is recognized, concerns remain as to the effort required to create metadata that effectively describe the data.

Concern:  workload required to capture accurate robust metadata
Solution: incorporate metadata creation into data development process – distribute the effort

Concern:  time and resources to create, manage, and maintain metadata
Solution: include in grant budget and schedule

Concern:  readability / usability of metadata
Solution: use a standardized metadata format

Concern:  discipline specific information and ontologies
Solution: ‘profile’ standard to require specific information and use specific values

• Metadata allows data developers to:
  o Avoid data duplication
  o Share reliable information
  o Publicize efforts – promote the work of a scientist and his/her contributions to a field of study

• Metadata gives a user the ability to:
  o Search, retrieve, and evaluate data set information from both inside and outside an organization
  o Find data: Determine what data exists for a geographic location and/or topic
  o Determine applicability: Decide if a data set meets a particular need
  o Discover how to acquire the dataset you identified; process and use the dataset

• Metadata helps ensure an organization’s investment in data:
  o Documentation of data processing steps, quality control, definitions, data uses, and restrictions
  o Ability to use data after initial intended purpose
  o Offers data permanence
  o Creates institutional memory
• Advertises an organization’s research:
  o Creates possible new partnerships and collaborations through data sharing
• Transcends people and time:

[Figure: loss of data details over time (axes: DATA DETAILS vs. TIME), from Michener et al 1997]
  Time of data development
  Specific details about problems with individual items or specific dates are lost relatively rapidly
  General details about datasets are lost through time
  Retirement or career change makes access to “mental storage” difficult or unlikely
  Accident or technology change may make data unusable
  Loss of data developer leads to loss of remaining information

Sound information management, including metadata development, can arrest the loss of dataset detail.

• Metadata can support:
  o data distribution
  o data management
  o project management
• If it is:
  o considered a component of the data
  o created during data development
  o populated with rich content

[Figure: project workflow diagram with metadata ("meta") attached to its steps: collect, classify, derive, planimetric, imagery, analysis, charette, alternative, committee review, PLAN]

• The descriptive content of the metadata file can be used to identify, assess, and access available data resources.
IDENTIFY
• keywords
• geographic location
• time period
• attributes

ASSESS
• use constraints
• access constraints
• data quality
• availability/pricing

ACCESS
• online access
• order process
• contacts

• A metadata collection can be published to the internet via:
  o website catalog
  o web accessible folder (WAF)
  o Z39.50 metadata clearinghouse
  o metadata service
  o geospatial data portal

[Figure: Internet, Intranet, User Query, Metadata Collection, Dataset]

• Examples of metadata search portals:
  o Data.gov
    • Federal e-gov geospatial data portal
    • http://www.geo.data.gov
  o Metacat
    • Repository for data and metadata
    • http://knb.ecoinformatics.org/index.jsp
  o US Geological Survey
    • USGS Core Science Metadata Clearinghouse: http://mercury.ornl.gov/clearinghouse
  o ArcGIS Online
    • ESRI sponsored national geospatial data portal
    • http://www.geographynetwork.com

• Metadata records can be used to track data provenance and accuracy
• Data Maintenance:
  o Are the data current?
    • Do we have data older than ten years?
    • Were the data created before some political or geophysical event that resulted in significant change?
  o Are the data valid?
    • Were they created prior to the most current source data?
    • Were they created prior to the most current methodologies?
• Data Update:
  o Contact information
  o Distribution policies, availability, pricing, URLs
  o New derivations of the dataset

If you create metadata, you can find your own data

If you create metadata, other people can discover your data
• Find your data by:
  o themes / attributes
  o geographic location
  o time ranges
  o analytical methods used
  o sources and contributors
  o data quality

Discoverable data is usable data!
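The "find your data by" criteria above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the `MetadataRecord` class, its field names, the `find_records` helper, and the sample records are all hypothetical and do not follow any particular metadata standard or portal API. Keywords are assumed to be stored in lowercase.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MetadataRecord:
    """Hypothetical, heavily simplified metadata record (not a real standard)."""
    title: str
    keywords: set[str]  # assumed to be stored in lowercase
    # Bounding box as (west, south, east, north) in decimal degrees.
    bbox: tuple[float, float, float, float]
    start: date
    end: date


def bbox_overlaps(a, b):
    """Return True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find_records(records, keyword=None, bbox=None, start=None, end=None):
    """Filter records by theme keyword, geographic location, and time range."""
    hits = []
    for rec in records:
        if keyword is not None and keyword.lower() not in rec.keywords:
            continue
        if bbox is not None and not bbox_overlaps(rec.bbox, bbox):
            continue
        # Keep only records whose time period overlaps the requested range.
        if start is not None and rec.end < start:
            continue
        if end is not None and rec.start > end:
            continue
        hits.append(rec)
    return hits


# Made-up records, for illustration only.
collection = [
    MetadataRecord("Frog call survey", {"amphibians", "temperature"},
                   (-124.0, 42.0, -116.5, 46.3), date(2005, 3, 1), date(2009, 6, 30)),
    MetadataRecord("Stream water quality", {"water quality", "streams"},
                   (-80.5, 37.0, -75.0, 40.0), date(2010, 1, 1), date(2011, 12, 31)),
]

matches = find_records(collection, keyword="amphibians",
                       bbox=(-123.0, 44.0, -122.0, 45.0),
                       start=date(2008, 1, 1))
print([rec.title for rec in matches])
```

A dataset without keywords, a geographic extent, or a time period in its record could not be returned by any of these filters, which is the practical sense in which undocumented data is undiscoverable.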
• Metadata allows you to repeat scientific process if:
  o methodologies are defined
  o variables are defined
  o analytical parameters are defined
• Metadata allows you to defend your scientific process:
  o demonstrate process
  o increasingly GIS-savvy public requires metadata for consumer information

• Metadata is an exercise in data accountability. It requires you to assess:
  o What do you know about the dataset?
  o What don’t you know about the dataset?
  o What should you know about the dataset?

Are you willing to associate yourself with the metadata record?

• Metadata is a declaration of:
  Purpose (what to do…)
    o the originator’s intended application of the data
  Use Constraints (what not to do…)
    o inappropriate applications of the data
  Completeness
    o features or geographies excluded from the data
  Distribution Liability
    o explicit liability of the data producer and assumed liability of the consumer

Project Coordination

• Metadata records can serve as a project design document:
  o descriptions & intent of project
  o geographic and temporal extent of project
  o source data of project
  o attribute requirements of project
• Benefits:
  o expectations are clearly outlined
  o metadata is integrated into the process
  o provides a medium to record progress

• Use metadata to monitor:
  o data development status milestones
  o QA/QC assessments
  o needed changes in approach

Monitoring requires that the metadata be actively maintained and reviewed!

• Metadata can be a means to improve communications among project participants using common:
  o descriptions & parameters
  o keywords, vocabularies, thesauri
  o contact information
  o attributes
  o distribution information
• If reviewed regularly by all participants, metadata created early and updated during the project improves opportunity for coordinating:
  o source data
  o analytical methods
  o new information

• As a key component of the data, metadata should be part of any data deliverable
• For quality metadata from a deliverable, the record should provide:
  o Citation information
  o Data quality information
  o Accurate geospatial information
  o Clearly defined entities and attributes
  o Distribution information

• Dublin Core Element Set
  o Emphasis on web resources, publications
  o http://dublincore.org/documents/dces/
• FGDC Content Standard for Digital Geospatial Metadata (CSDGM)
  o Emphasis on geospatial data
  o Biological Data Profile (BDP) of the CSDGM
    • Profile to the CSDGM with emphasis on biological (and geospatial) data
  o http://www.fgdc.gov/metadata/geospatial-metadata-standards
• ISO 19115/19139 Geographic information: Metadata
  o Emphasis on geospatial data and services
  o http://www.fgdc.gov/metadata/geospatial-metadata-standards#fgdcendorsedisostandards
• Ecological Metadata Language (EML)
  o Focus on ecological data
  o http://knb.ecoinformatics.org/eml_metadata_guide.html
• Darwin Core
  o Emphasis on museum specimens
  o http://rs.tdwg.org/dwc/index.htm
• Geography Markup Language (GML)
  o Emphasis on geographic features (roads, highways, bridges)
  o http://www.opengeospatial.org/standards/gml

EML                  FGDC
Title                Title
Abstract             Abstract
Entity Description   Entity Type Definition
Intellectual Rights  Use Constraints

• Many standards collect similar information
• Factors
  to consider:
  o Your data type:
    • Are you working mainly with GIS data? Raster/vector or point data? Do you have biological or shoreline information in your dataset?
      - Consider the FGDC Content Standard for Digital Geospatial Metadata with one of its profiles: the Biological Data Profile or the Shoreline Data Profile.
    • Are you working with data retrieved from instruments such as monitoring stations or satellites? Are you using geospatial data services such as web-mapping or data-modeling applications?
      - If so, then consider using the ISO 19115-2 standard
    • Are you mainly working with ecological data?
      - Consider Ecological Metadata Language (EML)

• More factors to consider:
  o Your organization’s policies: do they state which standard to use?
  o What resources are available to create metadata? Examples of tools:
    • FGDC CSDGM:
      - Mermaid (NOAA) http://www.ncddc.noaa.gov/metadata-standards/mermaid/
      - Metavist (Forest Service) http://ncrs.fs.fed.us/pubs/viewpub.asp?key=2737
      - TKME (USGS) http://geology.usgs.gov/tools/metadata/tools/doc/tkme.html
    • EML:
      - Morpho (http://knb.ecoinformatics.org/morphoportal.jsp)
    • ISO: (http://www.fgdc.gov/metadata/iso-metadata-editor-review)
      - XML Spy or Oxygen
      - CatMD
  o Other factors: Availability of human support; instructional materials; use of controlled vocabularies; output formats

• Metadata is documentation of data
• A metadata record captures critical information about the content of a dataset
• Metadata allows data to be discovered, accessed, and re-used
• A metadata standard provides structure and consistency to data documentation
• Standards and tools vary – select according to defined criteria such as data type, organizational guidance, and available resources
• Metadata is of critical importance to data developers, data users, and organizations
• Metadata can be effectively used for:
  o data distribution
  o data management
  o project management
• Metadata completes a dataset.
Creating robust metadata is in your OWN best interest!

The full slide deck may be downloaded from: http://www.dataone.org/education-modules

Suggested citation: DataONE Education Module: Metadata. DataONE. Retrieved Nov 12, 2012. From http://www.dataone.org/sites/all/documents/L07_Metadata.pptx

Copyright license information: No rights reserved; you may enhance and reuse for your own purposes. We do ask that you provide appropriate citation and attribution to DataONE.