Using Darwin Core as a Model: An Ontologically Minimalist Approach to Publishing Occurrence Data in RDF
Joel Sachs
Formal Models track of the Semantics for Biodiversity Symposium
TDWG 2013

The first thing I want to communicate: Semantics != Ontologies

Semantics = Ontologies ?
• Semantics
  – Semiotics
  – Linguistics
  – Psychology
• Ontology
  – Philosophy
  – Computer Science

Ontologies as a vehicle for semantics
• Ontologies were the first choice for putting the “semantic” in semantic web.
• But ontologies aren’t the only way to supply semantics.
• Furthermore, ontologies can be a barrier to shared semantics, in a number of ways.

What’s green?
• Def 1: What’s green?
• Def 2: Green is the portion of the electromagnetic spectrum with a wavelength between 520 and 570 nm.
What’s electromagnetic? What’s a spectrum? What’s a wavelength? What’s a nanometer?

[Diagram: Occurrence as a single table]
Occurrence: Occurrence_ID, Latitude, Longitude, Scientific Name, Vernacular Name

[Diagram: Occurrence split across linked tables; columns are typed URI, DateTime, or float]
Occurrence: Occurrence_ID, Location_ID, DateTime, IndividualOrganism_ID
Location: Location_ID, Latitude, Longitude, Datum
Identification: Identification_ID, Individual_ID, Taxon, Identified_by
Taxon: Taxon_ID, Scientific Name, Vernacular Name, Authorship, Year, etc.

There are many ways to think about biodiversity data.

Thing #2 that I want to communicate
Darwin Core (as it is) can be used as a lightweight “ontology”.

Don’t try this at home

Thing #3
How to minimize the amount of ontology in the Core.

Example: Material Sample
dwctype:MaterialSample (roughly?) corresponds to OBI:Specimen.

<owl:Class rdf:about="http://purl.obolibrary.org/obo/OBI_0100051">
  <owl:equivalentClass>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <rdf:Description rdf:about="http://purl.obolibrary.org/obo/BFO_0000040"/>
        <owl:Restriction>
          <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000087"/>
          <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/OBI_0000112"/>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </owl:equivalentClass>
  <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/BFO_0000040"/>
  <owl:disjointWith rdf:resource="http://purl.obolibrary.org/obo/BFO_0000141"/>
  <n0pred:IAO_0000602>(forall (x) (if (MaterialEntity x) (IndependentContinuant x))) // axiom label in BFO2 CLIF: [019-002] </n0pred:IAO_0000602>
  <n0pred:BFO_0000179>material</n0pred:BFO_0000179>
  <n0pred:BFO_0000180>MaterialEntity</n0pred:BFO_0000180>
  <n0pred:IAO_0000602>(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt x y t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [021-002] </n0pred:IAO_0000602>
  <n0pred:IAO_0000602>(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt y x t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [020-002] </n0pred:IAO_0000602>

curl -L -H "Accept: application/rdf+xml" http://rs.tdwg.org/dwc/dwctype/MaterialSample | grep OBI

<rdf:Description rdf:about="http://rs.tdwg.org/dwc/dwctype/MaterialSample">
  <rdfs:label xml:lang="en-US">MaterialSample</rdfs:label>
  <rdfs:comment xml:lang="en-US">A resource describing the physical results of a sampling (or subsampling) event.
In biological collections, the material sample is typically collected, and either preserved or destructively processed.</rdfs:comment>
  <rdfs:isDefinedBy rdf:resource="http://rs.tdwg.org/dwc/dwctype/"/>
  <dcterms:issued>2013-03-28</dcterms:issued>
  <dcterms:modified>2013-09-26</dcterms:modified>
  <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
  <dcterms:hasVersion rdf:resource="http://rs.tdwg.org/dwc/dwctype/history/#MaterialSample-2013-06-24"/>
  <dcam:memberOf rdf:resource="http://rs.tdwg.org/dwc/terms/DwCType"/>
  <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/OBI_0100051"/>
  <dwcattributes:status>recommended</dwcattributes:status>
  <dwcattributes:decision rdf:resource="http://rs.tdwg.org/dwc/terms/history/decisions/Decision_2013-10-09_12"/>
  <dwcattributes:abcdEquivalence>DataSets/DataSet/Units/Unit</dwcattributes:abcdEquivalence>
</rdf:Description>

curl -L -H "Accept: application/rdf+xml" http://rs.tdwg.org/dwc/dwctype/MaterialSample | grep OBI

<rdf:Description rdf:about="http://rs.tdwg.org/dwc/dwctype/MaterialSample">
  <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/OBI_0100051"/>
</rdf:Description>

On the one hand
• Nobody forces a consuming application to ingest the OBI and BFO ontologies when it ingests Darwin Core.
• So what’s the big deal?

On the other hand
• Many semantic web clients automatically fetch and load referenced documents.
  – Especially if the documents are referenced with important properties like rdfs:subClassOf
• It’s bad form (and slightly dangerous) to clutter a semantic web document with terms from unnecessary namespaces.

My suggestion?
• Assertions that tie Core terms to upper ontologies should be asserted in a separate document. E.g.

<rdf:Description rdf:about="http://rs.tdwg.org/dwc/dwctype/MaterialSample">
  <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/OBI_0100051"/>
</rdf:Description>

should be asserted in obi.owl, or dwc_obi.owl
• That way, those doing integration that depends on OBI axioms can ingest the appropriate descriptions.
• Those who don’t need the OBI axioms don’t have to worry about incorrect inference.
  – Keep in mind: There is no preferred upper ontology for science on the semantic web.
    • BFO, DOLCE, SUMO, UMBEL, NULO, etc.

Thank you for paying attention!
Questions, comments, and criticism to @xjsachs joel.sachs@agr.gc.ca
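
Appendix sketch: the separate-document suggestion above can be illustrated with a few lines of Python. This is only a minimal sketch, not part of the talk and not how Darwin Core is actually published. The function name `split_alignment`, the rule "any property whose rdf:resource starts with the OBO PURL prefix is an alignment assertion", and the trimmed-down sample description are all assumptions made for illustration. It uses only the standard library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Assumption: anything pointing into the OBO PURL space is an upper-ontology alignment.
OBO_PREFIX = "http://purl.obolibrary.org/obo/"

# Trimmed, illustrative copy of the MaterialSample description shown in the slides.
SOURCE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdf:Description rdf:about="http://rs.tdwg.org/dwc/dwctype/MaterialSample">
    <rdfs:label xml:lang="en-US">MaterialSample</rdfs:label>
    <rdfs:isDefinedBy rdf:resource="http://rs.tdwg.org/dwc/dwctype/"/>
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/OBI_0100051"/>
  </rdf:Description>
</rdf:RDF>"""


def split_alignment(rdf_xml, external_prefix=OBO_PREFIX):
    """Hypothetical helper: return (core, alignment) RDF/XML strings.

    Properties whose rdf:resource starts with external_prefix are moved out of
    the core document into a separate alignment document.
    """
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("rdfs", RDFS)
    resource_attr = f"{{{RDF}}}resource"
    about_attr = f"{{{RDF}}}about"

    core = ET.fromstring(rdf_xml)
    alignment = ET.Element(f"{{{RDF}}}RDF")

    for desc in core.findall(f"{{{RDF}}}Description"):
        about = desc.get(about_attr)
        if about is None:
            continue  # this sketch only handles descriptions identified by rdf:about
        moved = [p for p in desc if p.get(resource_attr, "").startswith(external_prefix)]
        if not moved:
            continue
        # Re-describe the same subject in the alignment document.
        twin = ET.SubElement(alignment, f"{{{RDF}}}Description", {about_attr: about})
        for prop in moved:
            desc.remove(prop)
            twin.append(prop)

    return ET.tostring(core, encoding="unicode"), ET.tostring(alignment, encoding="unicode")


if __name__ == "__main__":
    core_doc, alignment_doc = split_alignment(SOURCE)
    print(core_doc)       # Core description, no OBI reference
    print(alignment_doc)  # Candidate content for something like dwc_obi.owl
```

A client that loads only the first string never sees the OBI URI, so it has nothing to dereference. A client that wants the OBI axioms loads the second string as well.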