KnowledgeWiki: An OpenSource Tool for Creating Community-Curated Vocabulary, with a Use Case in Materials Science

Nishita Jaykumar, Pavankalyan Yallamelli, Vinh Nguyen, Sarasi Lalithsena, Krishnaprasad Thirunarayan, Amit Sheth
Kno.e.sis, Wright State University
Clare Paul
Air Force Research Laboratory, Wright-Patterson AFB

WWW - LDOW 2016, Canada

Context for Research
• Collaboration with AFRL
• Handbooks: ASM Handbook, MIL-HDBK-5, MIL-HDBK-17
• Standardized vocabularies: SKOS, Dublin Core, QUDT, VAEM, …
• Consolidated vocabulary (MatVocab)
• Crowdsourcing from domain experts

Motivating Example
Facts:
Name | Definition | Source
A-Basis | The mechanical property value is the value above which … | ASM Handbook, Volume 21: Composites.
ABasis | A statistically-based material property; a 95% lower… | Composite Materials Handbook Volume 1. MIL-HDBK-17F-1F, 17 June 2002
A-Basis | The lower of either a statistically calculated number… | Metallic Materials and Elements for Aerospace Vehicle Structures, MIL-HDBK-5J, 31 January 2003

Motivating Example
Facts:
Name | Definition | Source
YoungsModulus | The ratio of normal stress to corresponding … | ASM Handbook, Volume 21: Composites.
ModulusYoungs | The ratio of change in stress to change … | MIL-HDBK-17
• The same term can have multiple definitions, and each definition needs to be represented with its provenance information, such as source and time.

Related Work
Auxiliary node approach:
Subject | Predicate | Object
A-Basis | P26s | Auxiliary node1
Auxiliary node1 | P26v | “A statistically-based material …”
Auxiliary node1 | P580q | …
Auxiliary node1 | P582q | …
• Properties represented in the Wikidata model do not correspond to RDF properties
• Lack of formal semantics

Semantic MediaWiki
Representing entities and simple metadata:
The '''United Kingdom''' is a country located in [[Located in::Europe]].
• Extension to MediaWiki
• We use the Semantic Forms extension of Semantic MediaWiki for our task
• Limitation: it cannot represent metadata about the metadata
Source: http://www.slideshare.net/cool_uk/semantic-mediawiki-simple-tutorial

Our Approach
• Adopt the Singleton Property method for capturing triple metadata in SMW
• Import legacy data with provenance in bulk using the Singleton Property method
• Import existing RDF datasets with provenance into SMW for curation

Singleton Property
A singleton property represents one specific relationship between two entities under a certain context. It is assigned a URI, like any other property, and can be considered a subproperty or an instance of a generic property.

Facts:
Subject | Predicate | Object | Source | License
Autoclave | hasDefinition | “A closed vessel for producing…” | MIL-HDBK-17F-1F, 17 | All rights reserved

Singleton Property Translation:
Subject | Predicate | Object
hasDefinition#1 | rdf:sp | hasDefinition
Autoclave | hasDefinition#1 | “A closed vessel for producing…”
hasDefinition#1 | hasSource | MIL-HDBK-17
hasDefinition#1 | hasLicense | All rights reserved

Nguyen, Vinh, Olivier Bodenreider, and Amit Sheth. "Don't like RDF reification?: making statements about statements using singleton property." Proceedings of the 23rd International Conference on World Wide Web. ACM, 2014.

Why use Singleton Property?
• Formal semantics defined
• Scalable, e.g., to LOD
• Compatible with existing standards
  – RDF, RDFS, SPARQL
• Can be used to capture multiple types of metadata
  – Provenance, time, location

Fu, Gang, et al. "Exposing Provenance Metadata Using Different RDF Models." arXiv preprint arXiv:1509.02822 (2015).
Nguyen, Vinh, Olivier Bodenreider, and Amit Sheth. "Don't like RDF reification?: making statements about statements using singleton property." WWW 2014.
Hernández, Daniel, Aidan Hogan, and Markus Krötzsch. "Reifying RDF: What Works Well With Wikidata?" Proceedings of the 11th International Workshop on Scalable Semantic Web Knowledge Base Systems, co-located with the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA, USA. 2015.

Semantic MediaWiki Dataflow
[Figure: dataflow between the wiki article, category, property (with its data type), form definition (fields and input types), and template definitions (regular and singleton).]

Singleton vs. Regular Template
[Figure: the Autoclave page rendered with a regular template and with a singleton template; fields shown are Definition Text, Source, Rights, and Image.]

Regular vs. Singleton Templates
Regular template:
Subject | Predicate | Object
Autoclave | hasDefinition | “A closed vessel…”
Autoclave | source | “ASM Handbook”
Autoclave | license | “Reproduced by…”
Autoclave | hasImage | “Image.jpg”

Singleton template:
Subject | Predicate | Object
Autoclave | hasDefinition#1 | “A closed vessel…”
hasDefinition#1 | singletonPropertyOf | skos:definition
hasDefinition#1 | source | “ASM Handbook”
hasDefinition#1 | license | “Reproduced by…”
Autoclave | hasImage#1 | “Image.jpg”
hasImage#1 | singletonPropertyOf | mv:image

Overall Architecture
[Figure: overall architecture.]

Implementation
• Singleton Templates are our enhancement to SMW
• Implemented a parser function that handles Singleton Template parsing through a magic word
  – Register the magic word
  – The parser function processes the Singleton Templates and generates the RDF triples

Use Case in Materials Science
• Properties of interest to domain experts:
  – Definition Text
  – Source
  – License
  – Creator
  – Abbreviation
  – Synonyms
  – Units
  – …
mv: is the MatVocab namespace

Steps to create vocabularies
• Import existing vocabularies into SMW: SKOS, QUDT, etc.
• Create Term pages
• Create relevant properties: [[Property name:: property value]]
• Create the regular and singleton templates
  – 11 templates: 6 singleton templates, 5 regular templates
• Create the Form
  – Material Manufacturing and Design Form
  – MatVocab Form to create a term

Import legacy data with provenance
• Data from Excel spreadsheet files
  – 3 vocabularies
• We map the CSV data onto the 11 predefined templates
  – Some of the data is mapped to regular templates
  – The rest is mapped to singleton templates
• Structural Materials Vocabulary

Sample CSV data
Title | tmpltDefinText [Definition Text] | tmpltDefinText [Source] | tmpltDefinText [License Agreement]
Abasis | The mechanical property… | ASM Handbook | Reproduced with permission...
Abasis | A statistically-based material property … | MIL-HDBK-17 | Approved for public release
Abhesive | A material that resists adhesion… | MIL-HDBK-17 | Approved for public release
Ablation | The degradation decomposition… | MIL-HDBK-5J | Reproduced with permission...
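The mapping from CSV columns to templates can be illustrated with a minimal Python sketch. It assumes the header convention visible in the sample, `Template [Field]`, plus a `Title` column naming the page. The function name `rows_to_wikitext` and the exact wikitext layout are hypothetical and chosen for illustration; the deck does not show the import extension's actual output.

```python
import csv
import io
import re
from collections import defaultdict

# Matches headers such as "tmpltDefinText [Definition Text]".
HEADER = re.compile(r"^(?P<template>[^\[\]]+?)\s*\[(?P<field>[^\]]+)\]$")


def rows_to_wikitext(csv_text):
    """Group each CSV row into one template call per template, keyed by page title.

    Hypothetical helper: one row yields one call per template named in the
    headers, so a term with two definitions (two rows) gets two calls.
    """
    pages = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = row["Title"].strip()
        calls = {}  # template name -> {field name: value}
        for column, value in row.items():
            match = HEADER.match(column or "")
            if not match or not (value or "").strip():
                continue  # skip the Title column and empty cells
            calls.setdefault(match["template"], {})[match["field"]] = value.strip()
        for template, fields in calls.items():
            args = "".join(f"\n|{name}={val}" for name, val in fields.items())
            pages[title].append("{{" + template + args + "\n}}")
    return {title: "\n".join(blocks) for title, blocks in pages.items()}


SAMPLE = (
    "Title,tmpltDefinText [Definition Text],tmpltDefinText [Source],"
    "tmpltDefinText [License Agreement]\n"
    "Abasis,The mechanical property...,ASM Handbook,Reproduced with permission...\n"
    "Abasis,A statistically-based material property...,MIL-HDBK-17,"
    "Approved for public release\n"
)

pages = rows_to_wikitext(SAMPLE)
print(pages["Abasis"])
```

Both Abasis rows land on the same page as separate template calls, so each definition keeps its own source and license.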
Statistics of the Use Case
No. | Type | SMW
1 | Number of vocabularies imported | 3
2 | Total number of terms imported from ASM | 1,295
3 | Total number of terms imported from MIL-HDBK-5 | 19
4 | Total number of terms imported from MIL-HDBK-17 | 179
5 | Total number of Singleton Templates created | 6
6 | Total number of Regular Templates created | 5
7 | Total number of pages created | 1,685

RDF import with provenance
• We developed an extension similar to “CSV Import”
• We experimented with the YAGO-SP2 dataset
• Ongoing work
• More details can be found in our paper

For further information, please visit http://wiki.knoesis.org/index.php/KnowledgeWiki

RDF Import with provenance
• Curating existing RDF datasets with provenance
• No other tool offers this functionality
• Implemented the “RDF Import” extension
• Implemented a method to automatically identify the RDF subgraph structure
  – Identifies the regular properties and singleton properties

RDF Reification vs. Singleton Property
Time-aware facts:
Subject | Predicate | Object | Starts | Ends
BobDylan | marriedTo | Sara Lownds | 1965-11-22 | 1977-06-29

Standard RDF Reification:
Subject | Predicate | Object
#stmt1 | type | Statement
#stmt1 | hasSubject | BobDylan
#stmt1 | hasProperty | marriedTo
#stmt1 | hasObject | Sara Lownds
BobDylan | marriedTo | Sara Lownds
#stmt1 | starts | 1965-11-22
#stmt1 | ends | 1977-06-29

Singleton Property:
Subject | Predicate | Object
marriedTo#1 | rdf:sp | marriedTo
BobDylan | marriedTo#1 | Sara Lownds
marriedTo#1 | starts | 1965-11-22
marriedTo#1 | ends | 1977-06-29

Other approaches
• Standard Reification
• n-ary relations
• Named Graphs

Semantic Form Diagram
[Figure: Semantic Form diagram.]
Source: http://edutechwiki.unige.ch/en/File:Semantic_Form_Diagram.svg

Regular Template vs. Singleton Template Triples
Regular template, general pattern:
Subject | Predicate | Object
subj | prop_1 | value_1
subj | prop_2 | value_2

Regular template, example:
Subject | Predicate | Object
Autoclave | hasDefinition | “A closed vessel…”
Autoclave | hasImage | Autoclave.png

Singleton Template Triples
General pattern:
Subject | Predicate | Object
SomePageTitle | singletonProperty#n | value_1
singletonProperty#n | singletonPropertyOf | property_1
singletonProperty#n | property_2 | value_2
singletonProperty#n | property_3 | value_3

Example:
Subject | Predicate | Object
Autoclave | hasDefinition#1 | “A closed vessel…”
hasDefinition#1 | singletonPropertyOf | skos:definition
hasDefinition#1 | source | “ASM Handbook”
hasDefinition#1 | license | “Reproduced by…”
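The general pattern for singleton template triples can be sketched in a few lines of Python. This is only an illustration of the pattern on the slide, not the KnowledgeWiki parser function itself: the function name `singleton_template_triples`, its parameters, and the plain-string representation of triples are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def singleton_template_triples(
    page_title: str,
    local_name: str,
    index: int,
    generic_property: str,
    value: str,
    metadata: Dict[str, str],
) -> List[Triple]:
    """Expand one annotated fact into triples following the singleton pattern.

    Hypothetical helper: the singleton property is named "<local_name>#<index>",
    linked to its generic property via singletonPropertyOf, and every metadata
    entry is attached to the singleton property rather than to the page.
    """
    singleton = f"{local_name}#{index}"
    triples: List[Triple] = [
        (page_title, singleton, value),
        (singleton, "singletonPropertyOf", generic_property),
    ]
    # Metadata (source, license, ...) describes this one statement only.
    triples.extend((singleton, key, val) for key, val in metadata.items())
    return triples


example = singleton_template_triples(
    page_title="Autoclave",
    local_name="hasDefinition",
    index=1,
    generic_property="skos:definition",
    value="A closed vessel...",
    metadata={"source": "ASM Handbook", "license": "Reproduced by..."},
)
for triple in example:
    print(triple)
```

Because the metadata hangs off `hasDefinition#1` rather than off `Autoclave`, a second definition from another handbook would get `hasDefinition#2` with its own source and license, which is what the regular template cannot express.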