Marine Metadata Interoperability Initiative

Speeding up ontology creation of scientific terms
Luis Bermudez, John Graybeal
Monterey Bay Aquarium Research Institute
http://marinemetadata.org
December 7, 2005
Why are ontologies important?
At AGU we have 31 abstracts and 2 entire sessions related to ontologies.
Problem:
Semantic Interoperability
SSDS: get me data for Parameter temperature_1 (deg C)
AOSN: get me data for Variable ocean_temperature (C)
Need for controlled vocabulary
A set of restricted words, used by an information community when describing resources or discovering data.
The controlled vocabulary prevents misspellings and avoids the use of arbitrary, duplicative, or confusing words that cause inconsistencies when cataloging data.
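A minimal sketch of how a controlled vocabulary can catch misspelled or arbitrary terms when data are cataloged. The vocabulary terms and the function name check_terms are hypothetical and chosen only for illustration; they are not taken from any of the vocabularies in this talk.

```python
# Hypothetical controlled vocabulary; the terms are illustrative only.
CONTROLLED_VOCABULARY = {
    "sea_water_temperature",
    "sea_water_salinity",
    "chlorophyll",
}


def check_terms(terms, vocabulary=CONTROLLED_VOCABULARY):
    """Split terms into those found in the vocabulary and those rejected."""
    accepted, rejected = [], []
    for term in terms:
        normalized = term.strip().lower()
        if normalized in vocabulary:
            accepted.append(normalized)
        else:
            # Misspellings and arbitrary words end up here.
            rejected.append(term)
    return accepted, rejected


accepted, rejected = check_terms(
    ["Sea_Water_Temperature", "sea_water_temprature", "chlorophyll"]
)
print("accepted:", accepted)
print("rejected:", rejected)
```

The misspelled term is rejected instead of silently entering the catalog, which is the consistency benefit described above.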
Controlled Vocabularies: Discovery of Data
GCMD (HTML)
  http://gcmd.gsfc.nasa.gov/Resources/valids
BODC Discovery (comma-separated values)
  http://wwwtest.bodc.ac.uk/data/codes_and_formats/parameter_codes/bodc_para_dict.html
AGU Index Terms (HTML)
  http://www.agu.org/pubs/gaplist.html
MEL (HTML)
  https://mel.dmso.mil/docs/metadata_guide/section_6.htm
NOAA CoRIS Thesauri (PDF)
  http://www.coris.noaa.gov/backmatter/keywords/discovery_thesaurus.pdf
Controlled Vocabularies: Usage (tag the data collected)
BODC Dictionary of parameters (comma-separated values)
  http://wwwtest.bodc.ac.uk/data/codes_and_formats/parameter_codes/bodc_para_dict.html
U.S. JGOFS (HTML)
  http://usjgofs.whoi.edu/datasys/param_master.html
IOC GF3 parameter codes (HTML)
  http://ioc.unesco.org/oceanteacher/resourcekit/M3/Formats/Integrated/GF3/GF3.htm
SEACOOS (comma-separated values)
  http://twiki.sura.org/twiki/pub/Main/DataStandards/seacoos_draft_data_dictionary_v2.0.csv
CF (XML)
  http://www.cgd.ucar.edu/cms/eaton/cf-metadata/standard_name.xml
Problem:
Semantic Interoperability
Standard vocabularies
semantics
semantics
Harmonization
Tab Separated Values
Comma Separated Values
HTML
Web Ontology Language (OWL)
Relational Database
DTD
XML/XSD
RDF
Web Ontology Language: OWL




• World Wide Web Consortium (W3C) Recommendation (February 2004) to formally express ontologies.
• Based on the Resource Description Framework (RDF).
• Can be serialized in XML.
• Supporting tools: Jena, Protégé, SWOOP, Sesame, Pangloss, Kowari, VINE, Voc2OWL
Fast introduction to OWL
• RDF Triples
• RDF Resources
• Classes - individuals - properties
• RDF Graph
RDF: Triples, triples, triples
id: ocean_temperature
description: Ocean Temperature
units: C
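The record above can be read as three triples that share one subject. A minimal sketch in plain Python, assuming a made-up subject name (aosn:ocean_temperature) and using the column names as predicates:

```python
from collections import namedtuple

# A triple is a (subject, predicate, object) statement.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])


def record_to_triples(subject, record):
    """Turn one table row into one triple per column."""
    return [Triple(subject, predicate, value) for predicate, value in record.items()]


record = {
    "id": "ocean_temperature",
    "description": "Ocean Temperature",
    "units": "C",
}

# The subject identifier is an assumption made for this example.
triples = record_to_triples("aosn:ocean_temperature", record)
for triple in triples:
    print(triple.subject, triple.predicate, repr(triple.object))
```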
RDF: Resource
Resources
Literal
A resource is anything on the Web that has a unique identifier. Examples:
• URI: urn:aosn.mbari.org.recordVariable.id:1900
• URL: http://mmi.org/2005/08/gcmd-keyw#Chlorophyll
• URL: ftp://mmi.org/data-example
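A rough sketch of telling resources from literals: a value counts as a resource here only if it parses as a URI with one of a few known schemes. This is a heuristic assumed for illustration, not a rule from the RDF specification, and the function name is_resource and the scheme list are hypothetical.

```python
from urllib.parse import urlparse

# Schemes accepted as identifiers in this sketch (an assumption, not a standard list).
KNOWN_SCHEMES = {"http", "https", "ftp", "urn"}


def is_resource(value):
    """Return True if the value looks like a URI, False if it looks like a literal."""
    return urlparse(value).scheme in KNOWN_SCHEMES


examples = [
    "urn:aosn.mbari.org.recordVariable.id:1900",
    "http://mmi.org/2005/08/gcmd-keyw#Chlorophyll",
    "ftp://mmi.org/data-example",
    "Ocean Temperature",
]
for value in examples:
    kind = "resource" if is_resource(value) else "literal"
    print(kind, value)
```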
Classes Individuals Properties
Parameters (looks like a class)
Properties (attributes): id, description, units

id             description                        units
Temperature_1  water temperature from unit 00471  deg C
Temperature_2  water temperature from unit 00822  deg C

The rows look like individuals (members) of the class Parameter.
How are ontologies created?


• Conceptual direction strategy:
  • Top-down
  • Bottom-up
• Automation approach:
  • Manual
  • Automatic
Top-down approach
Bottom-up approach
Example:
1. Properties of real-world objects are identified.
2. Similarities are identified.
3. Concepts are created
4. and are expressed as a class.
5. Classes are related.

Lake: is inland body
River: has a relative defined channel
Shared property: has water
Body of Water (class)
  Lake (subclass)
  River (subclass)
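The five steps can be sketched in a few lines of Python. This is only an illustration: it assumes that "has water" is the property shared by Lake and River, and the function name bottom_up and the dictionary layout are hypothetical.

```python
def bottom_up(objects, superclass_name):
    """Derive a superclass from the properties shared by all observed objects.

    objects maps an object name to the set of properties identified for it.
    """
    # Step 2: similarities are the properties common to every object.
    shared = set.intersection(*objects.values())
    # Steps 3-4: the shared concept is expressed as a class.
    classes = {superclass_name: {"properties": shared, "subclass_of": None}}
    # Step 5: each object becomes a subclass that keeps only its own properties.
    for name, properties in objects.items():
        classes[name] = {
            "properties": properties - shared,
            "subclass_of": superclass_name,
        }
    return classes


# Step 1: properties of real-world objects (assumed from the slide's diagram).
observed = {
    "Lake": {"has water", "is inland body"},
    "River": {"has water", "has a relative defined channel"},
}

classes = bottom_up(observed, "Body of Water")
for name, info in classes.items():
    print(name, sorted(info["properties"]), "subclass of", info["subclass_of"])
```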
Bottom-up approach
Example:
1. Real-world objects: parameters in observatory systems.
2. They all have similar properties (id, description and units).
3. Make them a resource: instance of a class Parameter.
[Figure: tables of ssds:Parameter and aosn:Variable records with columns id, description and units; each record is linked by rdf:type to its class. ssds:Parameter records include Temperature_1 (water temperature from unit 00471, deg C); aosn:Variable records include ocean_temperature, ocean_temperature_2, ocean_temperature_all, ocean_temperature_qcflag, ocean_temperature_raw and sea_surface_temperature.]
Bottom-up approach (cont.)
sweet:Property
mmi:Parameter
ssds:Parameter
aosn:Variable
Manual (Ontology editor)
Protégé
List of more than 50 editors:
http://www.xml.com/2002/11/06/Ontology_Editor_Survey.html
Automatic
id                        description               units
ocean_temperature         Ocean Temperature         C
ocean_temperature_2       Ocean Temperature 2       C
ocean_temperature_all     Ocean Temperature All     C
ocean_temperature_qcflag  Ocean Temperature Qcflag  0=good, 1=missing, 2=marginal, 3=bad
ocean_temperature_raw     Ocean Temperature Raw     counts
sea_surface_temperature   Sea Surface Temperature   C

Table + properties file -> transformation (software program) -> ontology in OWL
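A minimal sketch of such a transformation, using only the Python standard library. It is not the Voc2OWL implementation: the namespace http://example.org/aosn#, the class name Variable, and the function name table_to_owl are assumptions made for this example. Each row becomes an individual of one class, and each column becomes a datatype property.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
NS = "http://example.org/aosn#"  # hypothetical namespace for the resources

ET.register_namespace("rdf", RDF)
ET.register_namespace("owl", OWL)
ET.register_namespace("ex", NS)


def table_to_owl(text, class_name, id_column, delimiter=","):
    """Convert a delimited table into an RDF/XML string with one OWL class."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    root = ET.Element(f"{{{RDF}}}RDF")
    # One class is always created.
    ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": NS + class_name})
    # Every column becomes a datatype property.
    for column in reader.fieldnames:
        ET.SubElement(
            root, f"{{{OWL}}}DatatypeProperty", {f"{{{RDF}}}about": NS + column}
        )
    # Every row becomes an individual; its local name comes from id_column.
    for row in reader:
        individual = ET.SubElement(
            root, f"{{{NS}}}{class_name}", {f"{{{RDF}}}about": NS + row[id_column]}
        )
        for column, value in row.items():
            ET.SubElement(individual, f"{{{NS}}}{column}").text = value
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    "id,description,units\n"
    "ocean_temperature,Ocean Temperature,C\n"
    "ocean_temperature_raw,Ocean Temperature Raw,counts\n"
    "sea_surface_temperature,Sea Surface Temperature,C\n"
)

owl_xml = table_to_owl(SAMPLE, class_name="Variable", id_column="id")
print(owl_xml)
```

Because the output is generated from the source table, rerunning the conversion after the vocabulary changes keeps the ontology in step with its source.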
Automatic

Advantages
• Fast
• Preserves a connection with the source (backward compatibility)
• Avoids typing and copy/paste errors
Disadvantage
• Only works with simple vocabularies (flat vocabularies and some taxonomies)
VOC2OWL
• Tool created by MMI
• Creates ontologies automatically, bottom-up, from two basic structures of simple vocabularies:
  • Flat vocabularies (e.g. phone directory)
  • Hierarchical vocabularies (e.g. taxonomies)
• Java (Eclipse) standalone application
Metadata
Conversion Properties I/O
• Format of the ASCII file to transform: tab or csv
• Location of the ASCII file
• Location where the ontology in OWL will be saved
Ontology Conversion Properties
• One class (at least) is always created.
• More than one class can be created.
• Namespace of the resources.
• Column from which the local names of the resources (individuals) are created.
Result
Parameters

id             description                        units
Temperature_1  water temperature from unit 00471  deg C
Temperature_2  water temperature from unit 00822  deg C
Ontology Conversion Properties
If treated as a hierarchy, there is no such primary class. All the lines in the ASCII file represent a hierarchy.
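A sketch of the hierarchical case, assuming each line of the ASCII file lists a path from the broadest term to the narrowest, separated by ">". Both the separator and the sample terms are assumptions for illustration; they are not quoted from GCMD or from the Voc2OWL file format. Each adjacent pair of levels yields one (subclass, superclass) relation.

```python
def hierarchy_to_subclass_pairs(lines, separator=">"):
    """Return unique (subclass, superclass) pairs from path-like lines."""
    pairs = []
    seen = set()
    for line in lines:
        levels = [level.strip() for level in line.split(separator) if level.strip()]
        # Each level is a subclass of the level just before it.
        for parent, child in zip(levels, levels[1:]):
            if (child, parent) not in seen:
                seen.add((child, parent))
                pairs.append((child, parent))
    return pairs


# Illustrative lines only.
lines = [
    "Oceans > Ocean Temperature > Sea Surface Temperature",
    "Oceans > Ocean Temperature > Water Temperature",
    "Oceans > Salinity",
]

pairs = hierarchy_to_subclass_pairs(lines)
for child, parent in pairs:
    print(child, "is a subclass of", parent)
```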
Example Hierarchy (GCMD)
Has been tested!
About 50 vocabularies were converted to OWL for the MMI workshop “Advancing Domain Vocabularies” (August 2005).
Why do we need all these ontologies?
The workshop was about relating terms from one controlled vocabulary to another.
Microsoft Excel was too hard to use for this purpose :-)
Mapping results
47 participants and 12 hours of mapping time
Topic           Direct mappings  Inferred mappings  Total mappings
Plant Pigments  405              1,022              1,427
PaCOOS          131              375                506
Waves           93               181                274
Currents        90               153                243
CTD             81               432                513
Habitats        23               37                 60
Total           823              2,200              3,023
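The slides do not say how the inferred mappings were derived. One plausible reading, sketched below purely as an assumption, is that "same as" mappings are symmetric and transitive, so two terms mapped to a common third term are also mapped to each other. The term names and the function infer_mappings are hypothetical.

```python
from itertools import combinations


def infer_mappings(direct):
    """Return pairs implied by treating direct mappings as symmetric and transitive."""
    # Group terms that are connected through any chain of direct mappings.
    groups = []
    for a, b in direct:
        merged = {a, b}
        remaining = []
        for group in groups:
            if group & merged:
                merged |= group
            else:
                remaining.append(group)
        groups = remaining + [merged]
    # Every pair inside a group is a mapping; keep those not stated directly.
    direct_set = {frozenset(pair) for pair in direct}
    inferred = set()
    for group in groups:
        for a, b in combinations(sorted(group), 2):
            if frozenset((a, b)) not in direct_set:
                inferred.add((a, b))
    return sorted(inferred)


# Hypothetical direct mappings made by workshop participants.
direct = [
    ("ssds:Temperature_1", "mmi:water_temperature"),
    ("aosn:ocean_temperature", "mmi:water_temperature"),
]

inferred = infer_mappings(direct)
print(inferred)
```

Under this assumption a few direct mappings to a shared term produce many inferred ones, which is consistent with inferred mappings outnumbering direct mappings in the table.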
VINE: Vocabulary Integration Environment
More…
• Advance the Marine Knowledge: 250,000 RDF triples (ontologies + mappings)
• They are available as:
  • SOAP web services at: http://marinemetadata.org/webservices
  • Ontology files at: http://marinemetadata.org/ns
Conclusions
• Solving semantic interoperability issues is fun.
• We need to relate data producers' vocabularies to standard vocabularies.
• OWL keeps growing in popularity, and more and more tools will be available.
• VOC2OWL can help you!
Our Guides
Executive Committee
• John Graybeal, MBARI (PI)
• Philip Bogden, SURA/SCOOP
• Stephen Miller, SIO
• Francisco Chavez, MBARI
• Stephanie Watson, Texas A&M
Steering Committee
• Roy Lowry, BODC
• Robert Arko, LDEO
• Julie Bosch, NOAA
• Ben Domenico, Unidata
• Karen Stocks, SDSC
• Steve Hankin, NOAA Ocean.US/DMAC
• Mark Musen, Stanford Univ
• Michael Parke, Univ of Hawaii
• Lola Olsen, NASA Goddard
• Bob Weller, WHOI
• Dawn Wright, Oregon State University
MMI: Your Handy Reference Guide
MMI: http://marinemetadata.org
Voc2OWL: http://marinemetadata.org/voc2owl
Vine: http://marinemetadata.org/vine
Help Line: ask@marinemetadata.org
Ontologies: http://marinemetadata.org/ns
Term Search: http://mmi.mbari.org:9600/mmi2/search.jsp
Tethys: http://marinemetadata.org/tethys