The-Rosetta-Model-Paper-1 - UCLA Institute for Geophysics

DRAFT – The Rosetta Model: Can the Different Physical Science Data Models be
Reconciled? – DRAFT
Todd A King1 (tking@igpp.ucla.edu)
Deborah L McGuinness2,3 (dlm@ksl.stanford.edu)
Raymond J Walker1 (rwalker@igpp.ucla.edu)
Peter Fox4 (pfox@ucar.edu)
D Aaron Roberts5 (aaron.roberts@nasa.gov)
Christopher Harvey6 (christopher.harvey@cesr.fr)
1 Institute of Geophysics and Planetary Physics/UCLA, 2835 Slichter Hall, Los Angeles, CA 90095-1567, United States
2 Rensselaer Polytechnic Institute
3 Stanford University, 353 Serra Hall, Stanford, CA 94305, United States
4 UCAR, 1850 Table Mesa Drive, Boulder, CO 80305, United States
5 NASA/NSSDC, Code 692 NASA Goddard Space Flight Center, Greenbelt, MD 20771
6 Centre de Données de la Physique des Plasmas (CDPP), 18 avenue Edouard Belin, TOULOUSE 31 401, France
1. Abstract
There are a variety of data models in the physical sciences, some of which are in
overlapping domains. Each of the data models has been derived in a different way. Some
have been based on formal ontologies, others on informal ontologies and others on
relational schemas. An additional complication is that different international agencies
have divided the physical science domains into different sub-domains leading to some
confusion as to which data model to adopt. The most prevalent data models in use today
are the Planetary Data System (PDS), Space Physics Archive Search and Extract
(SPASE), Virtual Solar Terrestrial Observatory (VSTO), the International Virtual
Observatory Alliance (IVOA) and the Global Change Master Directory (GCMD). We
take a comparative look at the various data models and ask the questions: Can they be
reconciled? Is it possible to have a Rosetta Model to translate between each of the
models? What role can ontologies play in defining a Rosetta Model?
2. Descriptions and Metadata
There are many different information models and classification ontologies in use today.
Each is designed for a particular application. Some are very general and others are
tailored for a specific discipline. Some of the most widely used are:
CAA: Cluster Active Archive. Designed to support the archiving and distribution
of high quality calibrated data products from ESA's Cluster mission, using
an approach general enough to be applicable to other environments. It has
a Mission, Observatory, Instrument hierarchy. The recovered data and
metadata are adequate for API use. 480 terms (198 classes and 282
enumeration elements); Purpose: Resource Discovery, Resource Sharing,
Archiving, Content Classification; Specification: Narrative and XML
Schema
Dublin Core: Originally designed for information resources (documents) and has
been expanded to include data, images, movies, and other types of
resources. 27 terms (15 core, 12 element types). Purpose: Resource
Discovery (published works); Specification: Narrative
IVOA: The International Virtual Observatory Alliance (IVOA) defines a set of
standards to "facilitate the international coordination" of the "utilization of
astronomical archives as an integrated and interoperating virtual
observatory." Standards set by the IVOA include VOTable, VOResource, and
Unified Content Descriptor (UCD). 63 terms (6 categories, 57 terms) and
486 UCD terms for data classification. Purpose: Resource Discovery (data,
collections, services, and curation) and Content Classification;
Specification: Narrative and XML Schema
OAI-ORE: The Object Reuse and Exchange (ORE) activity of the Open
Archives Initiative (OAI) is developing specifications that allow
distributed repositories to exchange information about their constituent
digital objects. The first release of the ORE specifications is scheduled for
March 8, 2008. OAI-ORE is distinct from OAI-PMH (a protocol
for exchanging metadata). Conceptual only. Purpose: Compound Object
Description.
PDS3: The Planetary Data System (PDS) defines a data set nomenclature designed to
be consistent across discipline boundaries and standards for labeling data
files. Its intent is to archive planetary science data and supporting
information to enable effective use and interpretation. 14,458 terms (1,643
elements, 81 objects, and 12,734 standard values, including 2,848 target names, 144
volume sets, 1,966 volumes and 1,370 data set IDs). Purpose: Archiving;
Specification: Narrative, ODL with PDS vocabulary
SPASE: The Space Physics Archive Search and Extract (SPASE) is a data model
designed for the Solar and Space Physics communities to unify the data
environment to facilitate finding, retrieving, formatting, and obtaining
basic information about data essential for research. 340 terms (10 resource
types, 35 entities (containers), 30 enumerations, 55 attributes, and 265 items
which are values used in enumerations (controlled lists)). Purpose:
Resource Discovery, Resource Sharing and Content Classification;
Specification: Narrative, XML Schema and XMI
SWEET: Semantic Web for Earth and Environmental Terminology (SWEET)
provides a common semantic framework for various Earth science
initiatives. There are 17 ontologies consisting of biosphere,
human_activities, process, substance, data_center, material_thing,
property, sunrealm, data, numerics, sensor, time, earthrealm, phenomena,
space, and units. 3,940 terms (17 ontologies). Purpose: Reference Model;
Specification: OWL
VSTO: Virtual Solar Terrestrial Observatory. Originally designed as a set of
ontologies for organizing and integrating information spanning upper
atmospheric terrestrial physics to solar physics. Fundamental classes
include instrument, observatory, data, and services. Its upper level has
been reused in other science areas including volcanology and plate
tectonics. 407 terms (one ontology with 35 top-level classes). Purpose:
Resource Discovery, Resource Sharing, and Content Classification.
Specification: OWL
3. The Rosetta Model
Participants
• Mission
• Observatory
• Instrument
    o Detector
• Person
• Reference
• Target
Product
• Sample (Physical)
• Data Structure (Digital)
    o Catalog (record collection)
    o Table (row, column)
    o Image (x, y, z)
    o Movie (x, y, z, t)
    o n-Array
    o Compound Structure (?)
• Documents
Resource
• Repository
• Registry
• Web Link
• Service
Collection
• Dataset
• Event
• Campaign
Annotation
• Notes
• Terms
• Associations
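The outline above can be read as a small taxonomy. The following is a minimal sketch, assuming a nested-dictionary encoding; the names ROSETTA_MODEL and find_path are illustrative and are not part of any of the data models discussed here.

```python
# Hypothetical encoding of the Rosetta Model outline as a nested dict.
# Leaves are empty dicts; the names follow the outline above.
ROSETTA_MODEL = {
    "Participants": {
        "Mission": {},
        "Observatory": {},
        "Instrument": {"Detector": {}},
        "Person": {},
        "Reference": {},
        "Target": {},
    },
    "Product": {
        "Sample": {},
        "Data Structure": {
            "Catalog": {}, "Table": {}, "Image": {}, "Movie": {},
            "n-Array": {}, "Compound Structure": {},
        },
        "Documents": {},
    },
    "Resource": {"Repository": {}, "Registry": {}, "Web Link": {}, "Service": {}},
    "Collection": {"Dataset": {}, "Event": {}, "Campaign": {}},
    "Annotation": {"Notes": {}, "Terms": {}, "Associations": {}},
}


def find_path(term, tree=ROSETTA_MODEL, prefix=()):
    """Return the tuple of names leading to `term`, or None if it is absent."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == term:
            return path
        found = find_path(term, children, path)
        if found is not None:
            return found
    return None
```

A lookup such as find_path("Detector") walks the tree depth-first, which is enough to place a term from another model under one of the five top-level groups.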
A. Cluster Active Archive
Designed to support the archiving and distribution of high quality calibrated data
products from ESA's Cluster mission, using an approach general enough to be applicable
to other environments. It has a Mission, Observatory, Instrument hierarchy. The
recovered data & metadata is adequate for API use.
From the Cluster Metadata Dictionary, Issue: 2, Date: May 4, 2006 Rev. : 2
Metadata is information which describes a dataset. It should be complete, that is, contain
all the information required to read and interpret the bits (syntactic description), and to
understand what the resulting numerical values (or bit strings) represent (semantic
description), including how the data was obtained; the latter information impacts upon
the scientific significance of the data. The purpose of the CAA Metadata Dictionary is to
describe fully the required CAA metadata information, and to explain how that
information must be formatted so as to be exploitable by the generic software of Cluster
Active Archive.
There are 6 top-level CAA concepts or classes:
Level         Description
Mission       This level contains information relevant to the whole mission.
Observatory   The Cluster mission consists of 4 observatories: Cluster-1, Cluster-2, Cluster-3, and Cluster-4.
Experiment    The Cluster mission has 11 experiments, each identified by its Principal Investigator, plus the auxiliary data.
Instrument    The Cluster instruments are identified by Observatory and Experiment.
Dataset       Each instrument produces one or more datasets; this level of metadata is common to the whole of each dataset.
Parameter     A dataset contains one or more parameters, each of which has its own metadata.
File          Each dataset is composed of files, the number of which will grow regularly with time during CAA.
For CAA, there will be:
one block of metadata at the mission level (for the Cluster mission),
four blocks at the observatory level (Cluster-1, Cluster-2, Cluster-3, Cluster-4),
eleven blocks at the experiment level (one for each of the eleven instruments),
sixty blocks of metadata (listed on page 32) at the instrument level, plus
a further six blocks of metadata for the various auxiliary data products.
To recover all the metadata relative to any one dataset it is necessary to know the relation
between these blocks of metadata. For example, when looking at the metadata associated
with the CIS-1 instrument (CIS instrument on Spacecraft 1) it is necessary to know that
this is associated with metadata concerning the Experiment CIS and the Observatory
Spacecraft-1, and that these are associated with the Mission Cluster. Linkage between the
different levels (illustrated by the arrows in Fig. 1) is provided at each level by concept
keywords included specially for this purpose.
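The linkage between levels can be sketched as a walk over parent links. This is only an illustration under stated assumptions: the excerpt does not give the actual linking keyword names, so the "parents" key, the BLOCKS table, and the related_blocks function are all hypothetical.

```python
# Hypothetical metadata blocks keyed by (level, name). The "parents" entries
# stand in for the linking keywords that the dictionary describes; the real
# keyword names are not given in the excerpt.
BLOCKS = {
    ("Mission", "Cluster"): {"parents": []},
    ("Observatory", "Spacecraft-1"): {"parents": [("Mission", "Cluster")]},
    ("Experiment", "CIS"): {"parents": [("Mission", "Cluster")]},
    ("Instrument", "CIS-1"): {
        "parents": [("Experiment", "CIS"), ("Observatory", "Spacecraft-1")]
    },
}


def related_blocks(key, blocks=BLOCKS):
    """Collect `key` and every block reachable from it through parent links."""
    seen = []
    stack = [key]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # the Mission block is reachable by two routes
        seen.append(current)
        stack.extend(blocks[current]["parents"])
    return seen
```

Starting from the CIS-1 instrument block, the walk reaches the CIS experiment, the Spacecraft-1 observatory, and the Cluster mission, which matches the example in the text.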
Overall Characteristics
Scope: 480 terms (198 classes, 282 enumeration elements)
Purpose: Resource Discovery, Resource Sharing, Archiving, Content Classification.
Specification: XML Schema
References
[CAA] Cluster Metadata Dictionary
http://caa.estec.esa.int/documents/DataD_V22.pdf
B. Dublin Core
Originally designed for information resources (documents) and has been expanded to
include data, images, movies, and other types of resources.
From Wikipedia
The Dublin Core standard includes two levels: Simple and Qualified. Simple Dublin Core
comprises fifteen elements; Qualified Dublin Core includes three additional elements
(Audience, Provenance and RightsHolder), as well as a group of element refinements
(also called qualifiers) that refine the semantics of the elements in ways that may be
useful in resource discovery.
Simple Dublin Core
The Simple Dublin Core Metadata Element Set (DCMES) consists of 15 metadata
elements:
1. Title
2. Creator
3. Subject
4. Description
5. Publisher
6. Contributor
7. Date
8. Type
9. Format
10. Identifier
11. Source
12. Language
13. Relation
14. Coverage
15. Rights
Each Dublin Core element is optional and may be repeated. The DCMI has established
standard ways to refine elements and encourage the use of encoding and vocabulary
schemes. There is no prescribed order in Dublin Core for presenting or using the
elements.
Full information on element definitions and term relationships can be found in the Dublin
Core Metadata Registry [DCMR].
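Because every element is optional and repeatable, a Simple Dublin Core record maps naturally onto a dictionary of lists. The sketch below assumes that representation; make_record is an illustrative helper, not part of any Dublin Core toolkit, and the lower-case element names are a convention chosen here.

```python
# The 15 elements of the Simple Dublin Core Metadata Element Set.
DCMES = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**elements):
    """Build a record; each element is optional and may hold several values."""
    record = {}
    for name, values in elements.items():
        if name not in DCMES:
            raise ValueError(f"not a Simple Dublin Core element: {name}")
        # Accept a single string or an iterable of strings.
        record[name] = [values] if isinstance(values, str) else list(values)
    return record
```

No element is required and no order is imposed, so an empty call is as valid as one that supplies several creators.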
Qualified Dublin Core
Subsequent to the specification of the original 15 elements, an ongoing process to
develop exemplary terms extending or refining the Dublin Core Metadata Element Set
(DCMES) was begun. The additional terms were identified, generally in working groups
of the Dublin Core Metadata Initiative, and judged by the DCMI Usage Board to be in
conformance with principles of good practice for the qualification of Dublin Core
metadata elements.
Element refinements make the meaning of an element narrower or more specific. A
refined element shares the meaning of the unqualified element, but with a more restricted
scope. The guiding principle for the qualification of Dublin Core elements, colloquially
known as the Dumb-Down Principle, states that an application that does not understand a
specific element refinement term should be able to ignore the qualifier and treat the
metadata value as if it were an unqualified (broader) element. While this may result in
some loss of specificity, the remaining element value (without the qualifier) should
continue to be generally correct and useful for discovery.
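The Dumb-Down Principle can be illustrated with a small mapping from refinements to the elements they refine. This is a sketch under assumptions: the REFINES table is a partial, illustrative selection of refinements, and dumb_down is a hypothetical helper operating on a dictionary-of-lists record.

```python
# A few element refinements and the elements they refine. This is a partial,
# illustrative table; the full list is maintained by DCMI.
REFINES = {
    "created": "date",
    "modified": "date",
    "abstract": "description",
    "alternative": "title",
    "spatial": "coverage",
    "temporal": "coverage",
}


def dumb_down(record):
    """Map qualified terms to their unqualified elements, keeping the values."""
    simple = {}
    for term, values in record.items():
        element = REFINES.get(term, term)  # unqualified terms pass through
        simple.setdefault(element, []).extend(values)
    return simple
```

A value recorded under "created" ends up under "date": some specificity is lost, but the value remains correct and usable for discovery.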
DCMI also maintains a small, general vocabulary recommended for use within the
element Type. This vocabulary currently consists of 12 terms:
1. Collection
2. Dataset
3. Event
4. Image
5. InteractiveResource
6. MovingImage
7. PhysicalObject
8. Service
9. Software
10. Sound
11. StillImage
12. Text
In addition to element refinements, Qualified Dublin Core includes a set of recommended
encoding schemes, designed to aid in the interpretation of an element value. These
schemes include controlled vocabularies and formal notations or parsing rules [DCENC].
A value expressed using an encoding scheme may thus be a token selected from a
controlled vocabulary (e.g., a term from a classification system or set of subject headings)
or a string formatted in accordance with a formal notation (e.g., "2000-12-31" as the
standard expression of a date). If an encoding scheme is not understood by an application,
the value may still be useful to a human reader.
Overall Characteristics
Scope: 27 terms (15 core, 12 element types).
Purpose: Resource Discovery (published works).
Specification: Narrative
References
[DCMR] Dublin Core Official web site
http://dublincore.org/dcregistry/
[DCENC] Dublin Core Encoding Guidelines
http://dublincore.org/resources/expressions/
[DCXML] Guidelines for implementing Dublin Core in XML
http://dublincore.org/documents/abstract-model/
C. IVOA
The International Virtual Observatory Alliance (IVOA) defines a set of standards to "facilitate
the international coordination" of the "utilization of astronomical archives as an
integrated and interoperating virtual observatory." Standards set by the IVOA include
VOTable, VOResource, and Unified Content Descriptor (UCD).
Excerpts from various IVOA documents [IVOA]
VOResource
The IVOA Resource Metadata specification (VOResource) permits describing the
following attributes of a resource [VORES]:
Identity metadata
    Title, ShortName, Identifier
Curation metadata
    Publisher, PublisherID, Creator, Creator.Logo, Contributor, Date, Version,
    Contact.Name, Contact.Address, Contact.Email, Contact.Telephone
General content metadata
    Subject, Description, Source, ReferenceURL, Type, ContentLevel,
    Relationship, RelationshipID
Collection and service content metadata
    Facility, Instrument, Coverage.Spatial, Coverage.RegionOfRegard,
    Coverage.Spectral, Coverage.Spectral.Bandpass,
    Coverage.Spectral.MinimumWavelength,
    Coverage.Spectral.MaximumWavelength, Coverage.Temporal.StartTime,
    Coverage.Temporal.StopTime, Coverage.Depth, Coverage.ObjectDensity,
    Coverage.ObjectCount, Coverage.SkyFraction, Resolution.Spatial,
    Resolution.Spectral, Resolution.Temporal, UCD, Format, Rights
Data quality metadata
    DataQuality, ResourceValidationLevel, ResourceValidatedBy,
    Uncertainty.Photometric,
    Uncertainty.Spatial, Uncertainty.Spectral, Uncertainty.Temporal
Service metadata
    Service.AccessURL, Service.InterfaceURL, Service.BaseURL,
    Service.HTTPResultsMIMEType, Service.StandardID,
    Service.MaxSearchRadius, Service.MaxReturnRecords,
    Service.MaxReturnSize
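The dotted names in this list imply a hierarchy (for example, Coverage.Spectral.Bandpass sits under Coverage.Spectral). A minimal sketch of making that hierarchy explicit follows; the nest function and the "_value" key are assumptions made for illustration and are not defined by VOResource.

```python
def nest(flat):
    """Turn dotted metadata names into nested dictionaries.

    Values are stored under a "_value" key so that a name such as
    "Coverage.Spectral" can carry a value and still have children
    such as "Coverage.Spectral.Bandpass".
    """
    tree = {}
    for dotted, value in flat.items():
        node = tree
        for part in dotted.split("."):
            node = node.setdefault(part, {})
        node["_value"] = value
    return tree
```

Grouping by the leading name this way makes it easy to compare, say, all Coverage terms in one model with the corresponding terms in another.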
Unified Content Descriptors
Unified Content Descriptors (UCD) is a formal vocabulary for astronomical data that is
controlled by the International Virtual Observatory Alliance (IVOA). The vocabulary is
restricted in order to avoid proliferation of terms and synonyms, and controlled in order
to avoid ambiguities. A UCD is used to classify a token of information. For example, it
may be used to identify the type of information in a field of a table or a tagged value in
metadata description [VOUCD].
All existing UCD1+ words are grouped into 12 main categories. These categories
are expressed by the first atom of the word, whose possible values are:
1. arith (arithmetics)
2. em (electromagnetic spectrum)
3. instr (instrument)
4. meta (metadata)
5. obs (observation)
6. phot (photometry)
7. phys (physics)
8. pos (positional data)
9. spect (spectral data)
10. src (source)
11. stat (statistics)
12. time (time)
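Since the category is carried by the first atom of each word, it can be recovered with simple string handling. The sketch below assumes that atoms are separated by "." and that several words may be combined with ";"; the function name ucd_categories is illustrative.

```python
# The 12 main UCD1+ categories, keyed by the first atom of a word.
UCD_CATEGORIES = {
    "arith": "arithmetics", "em": "electromagnetic spectrum",
    "instr": "instrument", "meta": "metadata", "obs": "observation",
    "phot": "photometry", "phys": "physics", "pos": "positional data",
    "spect": "spectral data", "src": "source", "stat": "statistics",
    "time": "time",
}


def ucd_categories(ucd):
    """Return the main category of each word in a UCD string.

    Assumes words are separated by ';' and atoms within a word by '.'.
    """
    result = []
    for word in ucd.split(";"):
        first_atom = word.strip().split(".")[0].lower()
        if first_atom not in UCD_CATEGORIES:
            raise ValueError(f"unknown UCD category: {first_atom!r}")
        result.append(UCD_CATEGORIES[first_atom])
    return result
```

For a table column tagged "pos.eq.ra;meta.main", the function reports positional data and metadata, in that order.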
VOTable
The VOTable format is an XML standard for the interchange of data represented as a set
of tables [VOTAB]. It extends the HTML Table specification by adding metadata to
describe the contents of the table. This includes the data type, units and classification of
the contents of each field in a table. The VOTable format also permits encoded binary data
to be included in the table, or external streams of binary data to be referenced.
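To make the idea of per-field metadata concrete, the sketch below reads a small VOTable-like document with Python's standard XML parser. The sample is simplified and namespace-free, its values are invented for illustration, and read_table is a hypothetical helper; a real VOTable would normally be read with a dedicated library.

```python
import xml.etree.ElementTree as ET

# A simplified, namespace-free VOTable-like document (illustrative values).
SAMPLE = """<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE name="example">
      <FIELD name="RA" datatype="double" unit="deg" ucd="pos.eq.ra"/>
      <FIELD name="Vmag" datatype="float" unit="mag" ucd="phot.mag"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.68</TD><TD>3.4</TD></TR>
          <TR><TD>83.82</TD><TD>4.0</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def read_table(xml_text):
    """Return (fields, rows): field attribute dicts and rows keyed by name."""
    table = ET.fromstring(xml_text).find("./RESOURCE/TABLE")
    fields = [dict(f.attrib) for f in table.findall("FIELD")]
    names = [f["name"] for f in fields]
    rows = [
        dict(zip(names, (td.text for td in tr.findall("TD"))))
        for tr in table.findall("./DATA/TABLEDATA/TR")
    ]
    return fields, rows
```

Each FIELD carries the data type, unit, and UCD classification for one column, so a reader can interpret the cell values without outside documentation.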
Overall Characteristics
Scope: 63 terms (6 categories, 57 terms) and
486 UCD terms for data classification.
Purpose: Resource Discovery (data, collections, services, and curation) and
Content Classification.
Specification: Narrative and XML Schema
References
[IVOA] IVOA Web Site
http://www.ivoa.net/
[VORES] Resource Metadata for the Virtual Observatory, Version 1.12, IVOA
Recommendation 2007 March 2.
http://www.ivoa.net/Documents/latest/RM.html
[VOUCD] The UCD1+ controlled vocabulary, Version 1.23, IVOA Recommendation 02
April 2007
http://www.ivoa.net/Documents/latest/UCDlist.html
[VOTAB] VOTable Format Definition, Version 1.1, IVOA Recommendation 2004-08-11
http://www.ivoa.net/Documents/latest/VOT.html
[VOASTR] Ontology of Astronomical Object Types, Version 1.0, IVOA Working Draft
2007 Feb 19
http://www.ivoa.net/Documents/WD/Semantics/AstrObjectOntology20070219.pdf
D. OAI-ORE
The Object Reuse and Exchange (ORE) activity of the Open Archives Initiative (OAI)
is developing specifications that allow distributed repositories to exchange
information about their constituent digital objects. The first release of the ORE
specifications is scheduled for March 8, 2008. OAI-ORE is distinct from OAI-PMH (a protocol for exchanging metadata).
Excerpts from the Object Reuse and Exchange white paper [OAIORE]
Compound information objects are aggregations of distinct information units that when
combined form a logical whole. Some examples of these are a digitized book that is an
aggregation of chapters, where each chapter is an aggregation of scanned pages; a music
album that is the aggregation of several audio tracks; an image object that is the
aggregation of a high quality master, a medium quality derivative and a low quality
thumbnail; a scholarly publication that is an aggregation of text and supporting materials
such as datasets, software tools, and video recordings of an experiment; and a multi-page
web document with an HTML table of contents that points to multiple interlinked HTML
individual pages. If we consider all information objects reusable in multiple contexts (a
notable feature of networked information), then the aggregation of a specific information
unit into a compound object is not due to the inherent nature of the information unit, but
the result of the intention of the human author or machine agent that composed the
compound object.
Research in the Semantic Web community has introduced the notion of named graphs,
which are essentially a set of RDF assertions, forming a graph, to which a URI is
assigned. The graph as a whole then can be treated as a web resource, and assertions
such as metadata statements, authority, etc. can be associated with that resource. These
ideas are very promising as an approach to expressing the notion of a compound object
on the web. However, they remain in a research phase, and need further specification in
order to become adoptable as part of an implementable interoperability specification.
Our proposals described later in this document build on this notion of a named graph.
A core goal of OAI-ORE – Object Reuse and Exchange – is to develop standardized,
interoperable, and machine-readable mechanisms to express compound object
information on the web. The OAI-ORE standards will make it possible for web clients
and applications to reconstruct the logical boundaries of compound objects, the
relationships among their internal components, and their relationships to the other
resources in the web information space. This will provide the foundation for the
development of value-adding services for analysis, reuse, and re-composition of
compound objects, especially in the areas of e-Science, e-Scholarship, and scholarly
communication, which are the target applications of ORE.
To enable widespread adoption of the standards developed by OAI-ORE we have
determined that they must be congruent with and leverage the Web Architecture. This
architecture essentially consists of:
• URIs that identify
• resources, which are “items of interest”, that,
• when accessed through standard protocols such as HTTP, return
• representations of current resource state
• and which are linked via URI references.
The combination of nodes, which denote resources, and arcs, which assert the
relationships among those resources, forms the web graph, and HTTP access to this graph
is the basis for services (e.g. robot-based search engines) and data mining (e.g., link
analysis) from which new information and knowledge is derived.
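The combination of a named graph and the node-and-arc view of the web suggests a very small data structure. The sketch below is an assumption-laden illustration, not the OAI-ORE specification: the NamedGraph class, the "hasPart" predicate, and the example.org URIs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NamedGraph:
    """A set of (subject, predicate, object) assertions identified by a URI."""
    uri: str
    triples: set = field(default_factory=set)

    def add(self, subject, predicate, obj):
        """Assert one arc between two resources."""
        self.triples.add((subject, predicate, obj))

    def nodes(self):
        """All resources that appear as a subject or an object."""
        return {s for s, _, _ in self.triples} | {o for _, _, o in self.triples}

    def parts_of(self, whole, predicate="hasPart"):
        """Resources linked from `whole` by the aggregation predicate."""
        return sorted(o for s, p, o in self.triples if s == whole and p == predicate)
```

Because the graph itself has a URI, further statements (authorship, date, rights) could be attached to the compound object as a whole rather than to any single part.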
An illustration of publishing a compound object on the web:
Overall Characteristics
In development
References
[OREPROJ] Open Archives Initiative - Object Reuse and Exchange Project
http://www.openarchives.org/ore/
[OAIORE] Open Archives Initiative – Object Reuse and Exchange, Compound
Information Objects: The OAI-ORE Perspective, May 28, 2007
http://www.openarchives.org/ore/documents/CompoundObjects-200705.html
E. PDS3
The Planetary Data System (PDS) defines a data set nomenclature designed to be consistent
across discipline boundaries and standards for labeling data files. Its intent is to archive
planetary science data and supporting information to enable effective use and
interpretation.
Excerpts from PDS documents [PDS]
A mission archive should contain sufficient documentation of the mission, the
instrument(s), and calibration procedures necessary for members of the current and future
science community to effectively use and, if appropriate, recalibrate the data. An archive
includes complete information about the geometry relevant to the observations (e.g.,
spacecraft position and orientation relative to the target). It also includes catalog files that
may be ingested into the PDS database along with the raw data and higher order data
products. [PDSAPG]
PDS defines a data set as a logical grouping of data products. Data sets may be combined
into data set collections. For example, all of the data sets from one instrument from a
given mission could be considered a data set collection.
Each product is assigned a Product ID which is unique within a data set. In turn each data
set has a unique ID, as well as each data set collection. It is possible to refer to a specific
product using a combination of Data Set ID and Product ID.
All data submitted to PDS must be accompanied by a set of catalog files which briefly
describe the mission, instrument host (that is, the spacecraft or other facility within which
the instrument operates), instrument, and data set. Additional catalog files identify key
personnel and references cited in other catalog files. The set of 6 core catalog files is:
MISSION.CAT:  Mission description.
INSTHOST.CAT: Instrument host (spacecraft) description.
INST.CAT:     Instrument description.
DATASET.CAT:  Dataset description.
PERSON.CAT:   Person description.
REF.CAT:      Reference description.
Targets may also be described in a Target Catalog file (TARGET.CAT).
Descriptions of products are stored in a "label" file which is stored with the data. A label
may be included at the beginning of a data file and describe its contents (attached label)
or be kept in a separate file (detached label). References to files within a label cannot contain
paths, so the label must exist at the same location (folder/directory) in the file system
as the data.
The contents of a label conform to the Object Definition Language (ODL) specification
and the current release of the PDS data dictionary. [PDSSR] The data dictionary
describes each object and allowed elements. A PDS label can describe the detailed
structure and format of a data file. This includes the binary representation of the data,
record structure and bit-level location of information.
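A label is essentially a list of keyword and value pairs with nested object blocks. The sketch below parses a reduced, illustrative label; it covers only single-line KEYWORD = VALUE statements and OBJECT blocks, not the full ODL grammar, and both SAMPLE_LABEL and parse_label are assumptions made for this example.

```python
# A reduced, illustrative label; it is not a complete or validated PDS label.
SAMPLE_LABEL = """\
PDS_VERSION_ID = PDS3
RECORD_TYPE = FIXED_LENGTH
^TABLE = "DATA.TAB"
OBJECT = TABLE
  ROWS = 100
  COLUMNS = 2
END_OBJECT = TABLE
END
"""


def parse_label(text):
    """Parse 'KEYWORD = VALUE' lines with nested OBJECT blocks into dicts."""
    root = {}
    stack = [root]  # the innermost open object is stack[-1]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "END":
            break
        key, _, value = (part.strip() for part in line.partition("="))
        value = value.strip('"')
        if key == "OBJECT":
            child = {}
            stack[-1].setdefault(value, []).append(child)
            stack.append(child)
        elif key == "END_OBJECT":
            stack.pop()
        else:
            stack[-1][key] = value
    return root
```

In this sample the file reference ("DATA.TAB") contains no directory component, consistent with the rule that a label must sit in the same location as the data it describes.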
There are currently 1,643 elements and 81 objects in the PDS data dictionary. Many
elements in PDS have standard value lists (a controlled list of possible values). PDS
defines allowed standard values as part of an element definition (not as a discrete element
in the dictionary). There are a total of 12,734 standard values. In the PDS system each
target (planet, moon, asteroid, etc) has a standard value as does each volume and dataset.
There are 2,848 target names, 144 volume sets, 1,966 volumes and 1,370 data set IDs.
Overall Characteristics
Scope: 14,458 terms (1,643 elements, 81 objects, and 12,734 standard values,
including 2,848 target names, 144 volume sets, 1,966 volumes
and 1,370 data set IDs)
Purpose: Archiving
Specification: Narrative and ODL with PDS vocabulary
References
[PDS] PDS Documentation
http://pds.nasa.gov/documents/
[PDSAPG] Planetary Data System - Archive Preparation Guide (APG), Aug 29, 2006,
Version 1.1
http://pds.nasa.gov/documents/apg/apg_Aug_29h.pdf
[PDSSR] PDS Standards Reference
http://pds.nasa.gov/documents/sr/index.html
F. SPASE
The Space Physics Archive Search and Extract (SPASE) is a data model designed for the
Solar and Space Physics communities to unify the data environment to facilitate finding,
retrieving, formatting, and obtaining basic information about data essential for research.
Excerpts from SPASE Documentation [SPASE-DM]
The Solar and Space Physics communities need a unified data environment to facilitate
finding, retrieving, formatting, and obtaining basic information about data essential for
their research. With the increasing requirement for data from multiple sources, this need
has become acute. A unified method to describe data and other resources is the key to
achieving this unified environment. The SPASE (Space Physics Archive Search and
Extract) Data Model provides a basic set of terms and values organized in a simple and
homogeneous way, to facilitate access to Solar and Space Physics resources. The SPASE
Model will provide the detailed information at the parameter level required for Solar and
Space Physics applications.
The Data Model provides enough detail to allow a scientist to understand the content of
Data Products (e.g., a set of files for 3 second resolution Geotail magnetic field data
for 1992 to 2005), together with essential retrieval and contact information. A typical use
would be to have a collection of descriptions stored in one or more related internet-based
registries of products; these could be queried with specifically designed search engines
which link users to the data they need. The Data Model also provides constructs for
describing components of a data delivery system. This includes repositories, registries
and services.
Resources
At the top level of the Data Model is the Resource Type. Each type of resource has a
tailored set of attributes to describe the resource. For data products the resource types
are:
Numerical Data,
Display Data, and
Catalog
and the resource types that support these:
Observatory,
Instrument,
Registry,
Repository,
Service,
Granule,
Person
Each resource is assigned a unique resource identifier (URI) so that it can be referenced
by other resources, within publications, by a user or by a service.
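Referencing by identifier can be sketched as a small in-memory registry. This is an illustration only: the Resource and Registry classes, their field names, and the identifier strings used with them are hypothetical and do not reproduce the SPASE schema.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    resource_id: str    # unique identifier for this resource
    resource_type: str  # one of the resource types, e.g. "Instrument"
    references: list = field(default_factory=list)  # IDs of other resources


class Registry:
    """In-memory lookup of resources by identifier."""

    def __init__(self):
        self._by_id = {}

    def register(self, resource):
        """Add a resource, rejecting a repeated identifier."""
        if resource.resource_id in self._by_id:
            raise ValueError(f"duplicate identifier: {resource.resource_id}")
        self._by_id[resource.resource_id] = resource

    def resolve(self, resource):
        """Return the resources that `resource` refers to."""
        return [self._by_id[rid] for rid in resource.references]
```

A data product description would then list the identifiers of its instrument and repository, and a search service could follow those references to assemble the full context.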
There are currently 10 resource types in SPASE. The data dictionary contains 378 terms
consisting of 35 entities (containers), 30 enumerations, 55 attributes and 265 items which
are values used in controlled lists (enumerations).
Overall Characteristics
Scope: 340 terms (10 resource types, 35 entities (containers), 30 enumerations,
55 attributes, and 265 items which are values used in enumerations
(controlled lists))
Purpose: Resource Discovery, Resource Sharing and Content Classification
Specification: Narrative, XML Schema and XMI
References
[SPASE] SPASE web site
http://www.spase-group.org/
[SPASE-DM] A Space and Solar Physics Data Model from the SPASE Consortium,
Version: 1.2.0, Release Date: 2007-05-22.
http://www.spase-group.org/data/doc/spase-1_2_0.pdf
G. SWEET
Semantic Web for Earth and Environmental Terminology (SWEET) provides a common
semantic framework for various Earth science initiatives. There are 17 ontologies
consisting of biosphere, human_activities, process, substance, data_center, material_thing,
property, sunrealm, data, numerics, sensor, time, earthrealm, phenomena, space, and
units.
Excerpts from SWEET Documents [SWEETGUIDE]
The ontologies within the Semantic Web for Earth and Environmental Terminology
(SWEET) provide an upper-level ontology for Earth system science. The SWEET
ontologies include several thousand terms, spanning a broad extent of Earth system
science and related concepts (such as data characteristics) using the OWL language. The
ontologies can be downloaded from http://sweet.jpl.nasa.gov/sweet. To support such a
large collection and adhere to the guiding principles, the concepts are divided, where
possible, into orthogonal dimensions or facets in support of reductionism. The primary
ontologies are shown in Figure and explained below. Each box represents a separate
ontology, and a connecting line indicates where major properties are used to define
concepts across ontology spaces.
Ontologies revised and validated Jan 26, 2006
Earth Realm
Physical Phenomena
Physical Process
Physical Property
Physical Substance
Sun Realm
Biosphere
Data
Data Center
Human Activity
Material Thing
Numerics
Sensor
Space
Time
Units
Overall Characteristics
Scope: 3,940 terms (17 ontologies)
Purpose: Reference Model
Specification: OWL
References
[SWEET] SWEET Web site
http://sweet.jpl.nasa.gov/
[SWEETGUIDE] Guide to SWEET Ontologies, Rob Raskin, NASA/Jet Propulsion Lab
http://sweet.jpl.nasa.gov/guide.doc
H. VSTO
Virtual Solar Terrestrial Observatory. Originally designed as a set of ontologies for
organizing and integrating information spanning upper atmospheric terrestrial physics to
solar physics. Fundamental classes include instrument, observatory, data, and services. Its
upper level has been reused in other science areas including volcanology and plate
tectonics.
Excerpts from the VSTO documents [VSTO]
The Virtual Solar Terrestrial Observatory (VSTO) project provides an electronic
repository of observational data spanning the solar-terrestrial physics domain. VSTO is a
distributed, scalable education and research environment for searching, integrating, and
analyzing observational, experimental and model databases in the fields of solar, solar-terrestrial and space physics (SSTSP) and utilizes semantic web technologies. We are
also implementing tools and infrastructure for accessing and using the data. Our main
contributions include the repository, infrastructure, and tools for the particular solar
terrestrial physics as well as the design and infrastructure that may be broadened to cover
more diverse science areas and communities of use.
Overall Characteristics
Scope: 407 terms (one ontology with 35 top-level classes)
Purpose: Resource Discovery, Resource Sharing, and Content Classification.
Specification: OWL
References
[VSTO] VSTO Project web site
http://vsto.hao.ucar.edu/