SWEET

SWEET: Upper-Level Ontologies for Earth System Science
OPeNDAP Meeting
Feb 2007
Rob Raskin
PO.DAAC
Jet Propulsion Laboratory
Data to Knowledge

Continuum: Data, Information, Knowledge (Syntax to Semantics)
Basic Elements: Bytes, Numbers, Models, Facts
Services: Ingest, Archive, Visualize, Infer, Understand, Predict
Storage: File, Database, HDF-EOS, GIS/MIS, Ontology, Mind
Interoperability: Syntactic, OPeNDAP, WMS/WCS, Semantic
Volume/Density: High/Low to Low/High
Statistics: Checksum, Moments, Descriptive, Inferential
Analysis: Fourier, Wavelet, EOF, SSA
Methodology: Exploratory analysis, Model-based mining
Semantics: Shared Understanding of Concepts

Provides a namespace for scientific terms…plus
Provides descriptions of how terms relate to one another
Example tags in markup language:
  subclass, subproperty, part of, same as, transitive property, cardinality, etc.
Enables object in “data space” to be associated formally with object in “science concept space”
“Shared understanding” enables software tools to find “meaning” in resources
Ontology Representation

W3C has adopted four XML-based standard ontology languages:
  RDF, OWL-Lite, OWL-DL, OWL Full
Basic building blocks:
  Class, subclass, property, subproperty, sameAs
Standard language enables anyone to extend an ontology
Knowledge built up incrementally
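
The building blocks above can be illustrated without any ontology toolkit. The following is a minimal sketch in plain Python, not OWL itself: statements are held as (subject, predicate, object) triples, and `superclasses` is a hypothetical helper that follows subClassOf links transitively. The class names are borrowed from elsewhere in these slides, but the particular hierarchy shown is assumed for illustration.

```python
# Minimal triple store: each statement is (subject, predicate, object).
# The hierarchy below is an illustrative assumption, not the SWEET source.
triples = {
    ("Troposphere", "subClassOf", "AtmosphereLayer"),
    ("AtmosphereLayer", "subClassOf", "PlanetaryLayer"),
}


def superclasses(cls, store):
    """Return every class reachable from cls via subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in store:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


# Extending the ontology is just adding statements: knowledge grows incrementally.
triples.add(("PlanetaryLayer", "subClassOf", "3DLayer"))

print(sorted(superclasses("Troposphere", triples)))
```

Because the new statement only adds a link, everything previously inferred about Troposphere still holds; the set of superclasses simply grows.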
Why an Upper-Level Ontology for Earth System Science?

Many common concepts used across Earth Science disciplines (such as properties of the Earth)
  Provides common definitions for terms used in multiple disciplines or communities
  Provides common language in support of community and multidisciplinary activities
Reduced burden (and barrier to entry) on creators of specialized domain ontologies
  Only need to create ontologies for incremental knowledge
Semantic Web for Earth & Environmental Terminology (SWEET)

Ontology of Earth system science and data concepts
Provides a common semantic framework (or namespace) for describing Earth science information and knowledge
Emphasis on improving search for NASA Earth science data resources
Represented in OWL-DL
SWEET Ontologies

Integrative Ontologies
Living Substances
Non-Living Substances
Faceted Ontologies
Natural Phenomena
Physical Processes
Human Activities
Earth Realm
Physical Properties
Data
Space
Time
Numerics
Units
SWEET Supports Knowledge Reuse

SWEET is a concept space
Enables scalable classification of Earth science and data-related concepts
Enables object in data space to be mapped to science concept space
Concept space is translatable into other languages/cultures using “sameAs” notions
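
The “sameAs” idea can be sketched as grouping terms into equivalence classes, so that a name used in a data file, a concept name, and a translation all resolve to the same concept. This is a minimal illustration in plain Python; every term below is hypothetical and not taken from SWEET, and `equivalence_classes` is an assumed helper name.

```python
# Hypothetical sameAs statements linking terms from different vocabularies.
same_as = [
    ("sst", "SeaSurfaceTemperature"),
    ("SeaSurfaceTemperature", "TemperaturaSuperficialDelMar"),
    ("chl", "Chlorophyll"),
]


def equivalence_classes(pairs):
    """Map each term to the set of all terms declared equivalent to it."""
    groups = {}
    for a, b in pairs:
        # Merge whatever is already known about either side of the statement.
        merged = groups.get(a, {a}) | groups.get(b, {b})
        for term in merged:
            groups[term] = merged
    return groups


groups = equivalence_classes(same_as)
print(sorted(groups["sst"]))
```

A lookup on the data-space name `sst` then returns the concept name and its translation, while unrelated terms such as `chl` stay in their own group.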
SWEET Science Ontologies

Earth Realms
  Atmosphere, SolidEarth, Ocean, LandSurface, …
Physical Properties
  temperature, composition, area, albedo, …
Substances
  CO2, water, lava, salt, hydrogen, pollutants, …
Living Substances
  Humans, fish, …
SWEET Conceptual Ontologies

Phenomena
  ElNino, Volcano, Thunderstorm, Deforestation, Terrorism, physical processes (e.g., convection)
  Each has associated EarthRealms, PhysicalProperties, spatial/temporal extent, etc.
  Specific instances included
    e.g., 1997-98 ElNino
Human Activities
  Fisheries, IndustrialProcessing, Economics, …
SWEET Numerical Ontologies

SpatialEntities
  Extents: country, Antarctica, equator, inlet, …
  Relations: above, northOf, …
TemporalEntities
  Extents: duration, century, season, …
  Relations: after, before, …
Numerics
  Extents: interval, point, 0, positiveIntegers, …
  Relations: lessThan, greaterThan, …
Units
  Extracted from Unidata’s UDUnits
  Added SI prefixes
  Multiplication of two quantities carries units
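
The last two Units points (SI prefixes, and units carried through multiplication) can be sketched as follows. This is a plain-Python illustration under stated assumptions, not how SWEET or UDUnits represents units: the `Quantity` class, the `with_prefix` helper, and the small prefix table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    units: tuple  # sorted (unit_name, exponent) pairs, e.g. (("m", 1), ("s", -1))

    def __mul__(self, other):
        # Multiplying quantities adds the exponents of matching units.
        exponents = dict(self.units)
        for name, power in other.units:
            exponents[name] = exponents.get(name, 0) + power
        # Drop units whose exponents cancel to zero.
        units = tuple(sorted((n, p) for n, p in exponents.items() if p != 0))
        return Quantity(self.value * other.value, units)


SI_PREFIXES = {"k": 1e3, "c": 1e-2, "m": 1e-3}  # small illustrative subset


def with_prefix(value, prefix, unit):
    """Express a prefixed value (such as 5 km) in the unprefixed unit."""
    return Quantity(value * SI_PREFIXES[prefix], ((unit, 1),))


speed = Quantity(3.0, (("m", 1), ("s", -1)))
duration = Quantity(10.0, (("s", 1),))
distance = speed * duration  # seconds cancel, metres remain
print(distance)
```

Storing units as exponent pairs keeps the arithmetic purely mechanical, which is the property that lets an ontology carry units through a product without special cases.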
Numerical Ontologies

Numeric concepts defined in OWL only through standard XML XSD spec
Added in SWEET
  Intervals defined as restrictions on real line
  Numerical relations (lessThan, max, …)
  Cartesian product (multidimensional spaces)
Numeric ontologies used to define spatial and temporal concepts
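
The three additions listed above can be sketched in a few lines. This is an illustrative Python model under stated assumptions (closed intervals, hypothetical names `Interval`, `less_than`, `in_product`), not the OWL encoding used in SWEET. It shows how a spatial concept can be built from numeric ones: a latitude band is an interval, and a region is a Cartesian product of intervals.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval: the real line restricted to [lower, upper]."""
    lower: float = -math.inf
    upper: float = math.inf

    def contains(self, x):
        return self.lower <= x <= self.upper


def less_than(a, b):
    """True when interval a lies entirely below interval b."""
    return a.upper < b.lower


def in_product(point, intervals):
    """Membership in the Cartesian product of one interval per dimension."""
    return len(point) == len(intervals) and all(
        interval.contains(x) for x, interval in zip(point, intervals)
    )


# Hypothetical spatial concept defined from numeric ones (degrees).
tropics = Interval(-23.5, 23.5)
longitude = Interval(-180.0, 180.0)

print(in_product((10.0, 120.0), (tropics, longitude)))
```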
XSD: Datatypes

Numeric
  boolean, decimal, float, double, integer, nonNegativeInteger, positiveInteger, nonPositiveInteger, negativeInteger, long, int, short, unsignedLong, unsignedInt, unsignedShort, unsignedByte, hexBinary, base64Binary
String
  string, normalizedString, anyURI, token, language, NMTOKEN, Name, NCName
Date
  dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth
Data and Services Ontology

Formats
Data models
Data Structures
Special values
  Missing, land, sea, ice, etc.
Parameters
  Scale factors, offsets, algorithms
Data Services
  Subset, reproject
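
To show why special values and parameters need to be described alongside the data, here is a minimal sketch of decoding packed values. It assumes the common linear packing convention (physical = raw × scale factor + offset), which the slides do not spell out; the `decode` function and the flag codes are hypothetical.

```python
def decode(raw_values, scale_factor, offset, special_values):
    """Turn packed integers into physical values, preserving flagged entries.

    special_values maps a raw code to its meaning (e.g. missing, land).
    Assumes linear packing: physical = raw * scale_factor + offset.
    """
    decoded = []
    for raw in raw_values:
        if raw in special_values:
            # Flags are labels, not measurements: never scale them.
            decoded.append(special_values[raw])
        else:
            decoded.append(raw * scale_factor + offset)
    return decoded


# Hypothetical flag codes for illustration only.
special = {-9999: "missing", -9998: "land"}
print(decode([100, -9999, 250, -9998], 0.5, 10.0, special))
```

Without the metadata, a tool would happily scale -9999 into a plausible-looking number, which is the kind of error a shared description of special values is meant to prevent.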
Example: AIRS Level 2 Dataset

Subset of Dataset where
  DataModel= Level 2
  Instrument= AIRS
  HorizontalDimension= 2
  VerticalDimension= 1
  Format= HDF-EOS
  Property= Temperature
  Substance= Air
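
Defining a dataset class as "the subset of Dataset where these facets hold" can be sketched as a filter over facet values. The facet names and values come from the example above; the catalogue records and the `matches` helper are hypothetical and made up for illustration.

```python
# The defining restrictions from the example above.
AIRS_LEVEL2_TEMPERATURE = {
    "DataModel": "Level 2",
    "Instrument": "AIRS",
    "HorizontalDimension": 2,
    "VerticalDimension": 1,
    "Format": "HDF-EOS",
    "Property": "Temperature",
    "Substance": "Air",
}


def matches(dataset, restrictions):
    """True when the dataset satisfies every facet restriction."""
    return all(dataset.get(facet) == value for facet, value in restrictions.items())


# Hypothetical catalogue records (not real holdings).
catalogue = [
    dict(AIRS_LEVEL2_TEMPERATURE, name="record_a"),
    dict(AIRS_LEVEL2_TEMPERATURE, name="record_b", Property="Humidity"),
    {"name": "record_c", "Instrument": "AIRS"},
]

members = [d["name"] for d in catalogue if matches(d, AIRS_LEVEL2_TEMPERATURE)]
print(members)
```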
Fragment of SWEET

[Figure: graph fragment of the ontology. Classes: 3DLayer, PlanetaryLayer, AtmosphereLayer, Atmosphere, Troposphere, Stratosphere, Tropopause. Relations: subClassOf, partOf, isUpperBoundaryOf, isLowerBoundaryOf. Annotations: sameAs = “Lower Atmosphere”; primarySubstance = “air”; upperBoundary = 50 km; lowerBoundary = 15 km.]
How SWEET was Initially Populated

Initial sources
  GCMD
    Over 10,000 datasets
    Over 1000 keywords
    Data providers submit additional terms for “free-text” search
  CF
    Over 700 keywords
    Very long term names
      surface_downwelling_photon_spherical_irradiance_in_sea_water
    Decomposed into facets
      Property= spherical_irradiance
      Substance= sea_water
      Space= surface
      Direction= down
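
The decomposition above can be sketched as a small rule-based parser. This is an illustration only, not the procedure used to populate SWEET: the vocabularies are tiny hypothetical lookup tables, the `decompose` function is an assumed name, and leftover tokens (here `photon`) are kept under an extra `Unmatched` key that does not appear on the slide.

```python
# Tiny hypothetical vocabularies; a real decomposition would need far more.
SPACE_TERMS = {"surface"}
DIRECTION_TERMS = {"downwelling": "down", "upwelling": "up"}
PROPERTY_TERMS = {"irradiance", "spherical_irradiance", "temperature"}


def decompose(name):
    """Split a CF-style name into facets using simple positional rules."""
    facets = {}
    # "<...>_in_<substance>" names the medium after the "_in_" separator.
    body, separator, substance = name.partition("_in_")
    if separator:
        facets["Substance"] = substance
    tokens = body.split("_")
    if tokens and tokens[0] in SPACE_TERMS:
        facets["Space"] = tokens.pop(0)
    if tokens and tokens[0] in DIRECTION_TERMS:
        facets["Direction"] = DIRECTION_TERMS[tokens.pop(0)]
    # Take the longest known property at the end of what remains.
    for start in range(len(tokens)):
        candidate = "_".join(tokens[start:])
        if candidate in PROPERTY_TERMS:
            facets["Property"] = candidate
            tokens = tokens[:start]
            break
    if tokens:
        facets["Unmatched"] = "_".join(tokens)
    return facets


print(decompose("surface_downwelling_photon_spherical_irradiance_in_sea_water"))
```

Keeping unmatched tokens visible, rather than silently dropping them, makes it easy to see which parts of a long name still need a home in the ontology.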
Collaboration Web Site

Discussion tools
  Blog, wiki, moderated discussion board
Version Control/Configuration Management
Trace dependencies on external ontologies
Tools to search for existing concepts in registered ontologies
Ontology Validation Procedure
  W3C note is formal submission method
Registry/discovery of ontologies
Support workflows/services for ontology development
Community Issues

Content
Standards and Conventions
Agreement on standards for use of OWL
Fuzzy representation conventions
Submit as standard to NASA Standards & Processes Working Group
Review Board
Maintain alignment given expansion of classes and properties
Who will oversee and maintain in perpetuity (or at least through the next funding cycle)?
ESIP Federation? A new consortium?
Global Support
Provide tools to visualize and appreciate the big picture
Update/Matching Issues

No removal of terms except for spelling or factual errors
Must avoid contradictions
Additions can create redundancy if sameAs not used
Humans must oversee “matching”
Subscription service to notify affected ontologies when changes made
CF has established moderator to carry out analogous additions
OWL “import” imports entire file
Associate community with ontology terms
Community tagging
Best Practices

Keep ontologies small, modular
Be careful that owl:imports imports everything
Use higher level ontologies where possible
Identify hierarchy of concept spaces
Model schemas
Try to keep dependencies unidirectional
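
Whether a set of modular ontologies keeps its dependencies unidirectional can be checked mechanically: if the import graph has a valid load order, there are no circular imports. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the module names and their import relationships are hypothetical, chosen only to resemble the ontology names in these slides.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical import graph: each ontology maps to the ontologies it imports.
imports = {
    "phenomena": {"realm", "property"},
    "realm": {"space"},
    "property": {"units"},
    "space": {"numerics"},
    "units": {"numerics"},
    "numerics": set(),
}


def load_order(import_graph):
    """Return an order in which every ontology follows what it imports.

    Raises graphlib.CycleError if the imports are circular.
    """
    return list(TopologicalSorter(import_graph).static_order())


order = load_order(imports)
print(order)

# A circular import is reported rather than silently accepted.
try:
    load_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("circular import detected")
```

Since an import pulls in the whole file, a cycle means every ontology in it effectively contains all the others, which defeats the point of keeping modules small.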
Web Sites


http://sweet.jpl.nasa.gov
http://PlanetOnt.org