Speeding up ontology creation of scientific terms
Luis Bermudez, John Graybeal, Monterey Bay Aquarium Research Institute
http://marinemetadata.org
December 7, 2005
Marine Metadata Interoperability Initiative

Why are ontologies important?
At AGU we have 31 abstracts and 2 entire sessions related to ontologies.

Problem: Semantic Interoperability
SSDS: "get me Data for Parameter temperature_1 (deg C)"
AOSN: "get me Data for Variable ocean_temperature (C)"

Need for controlled vocabulary
A controlled vocabulary is a set of restricted words used by an information community when describing resources or discovering data. It prevents misspellings and avoids the use of arbitrary, duplicative, or confusing words that cause inconsistencies when cataloging data.

Controlled Vocabularies: Discovery of Data
• GCMD (HTML): http://gcmd.gsfc.nasa.gov/Resources/valids
• BODC Discovery (Comma Separated Value): http://wwwtest.bodc.ac.uk/data/codes_and_formats/parameter_codes/bodc_para_dict.html
• AGU Index Terms (HTML): http://www.agu.org/pubs/gaplist.html
• MEL (HTML): https://mel.dmso.mil/docs/metadata_guide/section_6.htm
• NOAA CoRIS Thesauri (PDF): http://www.coris.noaa.gov/backmatter/keywords/discovery_thesaurus.pdf

Controlled Vocabularies: Usage (tag the data collected)
• BODC Dictionary of parameters (Comma Separated Value): http://wwwtest.bodc.ac.uk/data/codes_and_formats/parameter_codes/bodc_para_dict.html
• U.S. JGOFS (HTML): http://usjgofs.whoi.edu/datasys/param_master.html
• IOC GF3 parameter codes (HTML): http://ioc.unesco.org/oceanteacher/resourcekit/M3/Formats/Integrated/GF3/GF3.htm
• SEACOOS (Comma Separated Value): http://twiki.sura.org/twiki/pub/Main/DataStandards/seacoos_draft_data_dictionary_v2.0.csv
• CF (XML): http://www.cgd.ucar.edu/cms/eaton/cf-metadata/standard_name.xml

Problem: Semantic Interoperability
[Diagram: standard vocabularies; labels: semantics, semantics]

Harmonization
Source formats: Tab Separated Values, Comma Separated Values, HTML, Relational Database, DTD, XML/XSD, RDF
Target: Web Ontology Language (OWL)

Web Ontology Language: OWL
• World Wide Web Consortium (W3C) Recommendation (February 2004) to formally express ontologies.
• Based on the Resource Description Framework (RDF).
• Can be serialized in XML.
• Supporting tools: JENA, Protégé, SWOOP, Sesame, Pangloss, Kowari, VINE, Voc2OWL

Fast introduction to OWL
• RDF Triples
• RDF Resources
• Classes, individuals, properties
• RDF Graph

RDF: Triples, triples, triples
id:          ocean_temperature
description: Ocean Temperature
units:       C

RDF: Resource
[Diagram labels: Resources, Literal]
A resource is anything on the Web that has a unique identifier.
Examples:
• URI: urn:aosn.mbari.org.recordVariable.id:1900
• URL: http://mmi.org/2005/08/gcmd-keyw#Chlorophyll
• URL: ftp://mmi.org/data-example

Classes, Individuals, Properties
Parameters (looks like a class)
Properties (attributes): id, description, units

id              description                          units
Temperature_1   water temperature from unit 00471    deg C
Temperature_2   water temperature from unit 00822    deg C

The rows look like individuals (members) of the class Parameter.

How are ontologies created?
Conceptual direction strategy:
• Top-down
• Bottom-up
Automation approach:
• Manual
• Automatic

Top-down approach

Bottom-up approach
Example:
1. Properties of real world objects are identified.
2. Similarities are identified.
3. Concepts are created
4. and are expressed as a class.
5. Classes are related.
[Figure: example properties of Lake and River (inland body, water, defined channel); class Body of Water with subclasses Lake and River]

Bottom-up approach
Example: ssds:Parameter
1. Real world objects: parameters in the observatory systems.
2. They all have similar properties (id, description and units).
3. Make them a resource: instance of a class.
[Figure: sample rows of ssds:Parameter and aosn:Variable, each with id, description and units, linked by rdf:type to the class Parameter]

Bottom-up approach (cont.)
[Diagram linking sweet:Property, mmi:Parameter, ssds:Parameter and aosn:Variable]

Manual (Ontology editor)
Protégé
List of more than 50 editors: http://www.xml.com/2002/11/06/Ontology_Editor_Survey.html

Automatic

id                         description                units
ocean_temperature          Ocean Temperature          C
ocean_temperature_2        Ocean Temperature 2        C
ocean_temperature_all      Ocean Temperature All      C
ocean_temperature_qcflag   Ocean Temperature Qcflag   0=good, 1=missing, 2=marginal, 3=bad
ocean_temperature_raw      Ocean Temperature Raw      counts
sea_surface_temperature    Sea Surface Temperature    C

Vocabulary table + transformation properties file -> software program -> ontology in OWL

Automatic
Advantages
• Fast
• Preserves a connection with the source (backward compatibility)
• Avoids typing and copy/paste errors
Disadvantage
• Only works with simple vocabularies (flat vocabularies and some taxonomies)

VOC2OWL
• Tool created by MMI
• Automatically creates bottom-up ontologies from two basic structures of simple vocabularies:
  • Flat vocabularies (e.g. phone directory)
  • Hierarchical vocabularies (e.g. taxonomies)
• Java, Eclipse standalone application

Metadata

Conversion Properties I/O
• Format of the ASCII file to transform: tab or csv
• Location of the ASCII file
• Location where the ontology in OWL will be saved

Ontology Conversion Properties
• One class (at least) is always created.
• More than one class can be created.
• Namespace of the resources.
• Column from which the local names of the resources (individuals) will be created.

Result
Parameters

id              description                          units
Temperature_1   water temperature from unit 00471    deg C
Temperature_2   water temperature from unit 00822    deg C

Ontology Conversion Properties
• If treated as a hierarchy, there is no such primary class.
• All the lines in the ASCII file represent a hierarchy.

Example Hierarchy (GCMD)

Has been tested!
About 50 vocabularies were converted to OWL for the MMI workshop "Advancing Domain Vocabularies" (Aug. 2005).

Why do we need all these ontologies?
• The workshop was about relating terms from one controlled vocabulary to another.
• Microsoft Excel was too hard to use for this purpose :-)

Mapping results
47 participants and 12 hours of mapping time

Topic            Direct mappings   Inferred mappings   Total mappings
Plant Pigments   405               1,022               1,427
PaCOOS           131               375                 506
Waves            93                181                 274
Currents         90                153                 243
CTD              81                432                 513
Habitats         23                37                  60
Total            823               2,200               3,023

VINE: Vocabulary Integration Environment

More…
• Advance the Marine Knowledge: 250,000 RDF triples (ontologies + mappings)
• They are available as:
  • SOAP web services at: http://marinemetadata.org/webservices
  • Ontology files at: http://marinemetadata.org/ns

Conclusions
• Solving semantic interoperability issues is fun.
• We need to relate data producers' vocabularies with standard vocabularies.
• OWL is growing in popularity, and more and more tools will become available.
• VOC2OWL can help you!
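The automatic, bottom-up conversion of a flat vocabulary described in the Automatic and VOC2OWL slides can be sketched in a few lines of Python. This is only an illustration of the idea, not VOC2OWL's actual code: the function names, the `Literal` helper, and the namespace `http://example.org/ssds#` are assumptions made for the example. It creates one class, turns each row into an individual of that class (local name taken from the chosen column), and emits the remaining columns as property values.

```python
import csv
import io
from typing import NamedTuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"


class Literal(NamedTuple):
    """Marks a triple object as a plain literal rather than a resource."""
    value: str


def flat_vocabulary_to_triples(text, namespace, class_name, id_column, delimiter=","):
    """Turn a flat vocabulary table into (subject, predicate, object) triples.

    Hypothetical helper: one class is always created, and every row becomes
    an individual whose local name comes from `id_column`.
    """
    class_uri = namespace + class_name
    triples = [(class_uri, RDF_TYPE, OWL_CLASS)]
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        individual = namespace + row[id_column].strip()
        triples.append((individual, RDF_TYPE, class_uri))
        for column, value in row.items():
            if column != id_column and value:
                # Column names are reused as property local names (an assumption).
                triples.append((individual, namespace + column, Literal(value.strip())))
    return triples


def to_ntriples(triples):
    """Serialize the triples as N-Triples text, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            escaped = obj.value.replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        else:
            rendered = f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


# The two Parameter rows shown on the slides, as comma separated values.
SAMPLE = (
    "id,description,units\n"
    "Temperature_1,water temperature from unit 00471,deg C\n"
    "Temperature_2,water temperature from unit 00822,deg C\n"
)

if __name__ == "__main__":
    # The namespace is a placeholder, not an MMI namespace.
    print(to_ntriples(flat_vocabulary_to_triples(
        SAMPLE, "http://example.org/ssds#", "Parameter", "id")))
```

Because every triple is derived directly from a row of the source table, the connection with the source vocabulary is preserved and no term is retyped by hand, which are the advantages listed for the automatic approach.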

Our Guides
Executive Committee
• John Graybeal, MBARI (PI)
• Philip Bogden, SURA/SCOOP
• Stephen Miller, SIO
• Francisco Chavez, MBARI
• Stephanie Watson, Texas A&M
Steering Committee
• Roy Lowry, BODC
• Robert Arko, LDEO
• Julie Bosch, NOAA
• Ben Domenico, Unidata
• Karen Stocks, SDSC
• Steve Hankin, NOAA Ocean.US/DMAC
• Mark Musen, Stanford Univ
• Michael Parke, Univ of Hawaii
• Lola Olsen, NASA Goddard
• Bob Weller, WHOI
• Dawn Wright, Oregon State University

MMI: Your Handy Reference Guide
• MMI: http://marinemetadata.org
• Voc2OWL: http://marinemetadata.org/voc2owl
• Vine: http://marinemetadata.org/vine
• Help Line: ask@marinemetadata.org
• Ontologies: http://marinemetadata.org/ns
• Term Search: http://mmi.mbari.org:9600/mmi2/search.jsp
• Tethys: http://marinemetadata.org/tethys
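
As a companion to the hierarchical case mentioned in the Ontology Conversion Properties slide (every line of the ASCII file represents a hierarchy and there is no single primary class), here is a minimal Python sketch. How VOC2OWL actually encodes hierarchies is not stated in the slides, so the mapping used here (each term becomes a class that is an rdfs:subClassOf the term to its left) is an assumption, as are the function names, the namespace, and the sample rows, which are only illustrative keyword paths.

```python
import csv
import io

RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def local_name(term):
    """Build a URI-friendly local name by joining the words with underscores."""
    return "_".join(term.strip().split())


def hierarchy_to_subclass_triples(text, namespace, delimiter=","):
    """Read one hierarchy path per line and emit subclass triples.

    Hypothetical helper: for a row "A,B,C" it states that B is a subclass
    of A and C is a subclass of B. Repeated statements are emitted once.
    """
    triples = []
    seen = set()
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        path = [local_name(cell) for cell in row if cell.strip()]
        for parent, child in zip(path, path[1:]):
            triple = (namespace + child, RDFS_SUBCLASS_OF, namespace + parent)
            if triple not in seen:
                seen.add(triple)
                triples.append(triple)
    return triples


# Illustrative rows only; not copied from the GCMD list shown in the talk.
SAMPLE_PATHS = (
    "Oceans,Ocean Temperature,Water Temperature\n"
    "Oceans,Ocean Temperature,Sea Surface Temperature\n"
    "Oceans,Ocean Waves,Wave Height\n"
)

if __name__ == "__main__":
    # The namespace is a placeholder, not an MMI namespace.
    for s, p, o in hierarchy_to_subclass_triples(SAMPLE_PATHS, "http://example.org/gcmd#"):
        print(f"<{s}> <{p}> <{o}> .")
```

One limitation of this sketch: terms with the same name under different parents collapse into a single class, so a real converter would need a rule for disambiguating them.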