KnowledgeWiki: An OpenSource Tool for Creating
Community-Curated Vocabulary, with a Use Case in
Materials Science
Nishita Jaykumar, Pavankalyan Yallamelli, Vinh Nguyen,
Sarasi Lalithsena, Krishnaprasad Thirunarayan, Amit Sheth
Kno.e.sis, Wright State University
Clare Paul
Air Force Research Laboratory, Wright-Patterson AFB
WWW - LDOW 2016, Canada
Context for Research
• Collaboration with AFRL
[Diagram: ASM Handbook, MIL-HDBK-5, MIL-HDBK-17, and standardized vocabularies (SKOS, Dublin Core, QUDT, VAEM, …) feed a consolidated vocabulary (MatVocab), with crowdsourcing from domain experts]
Motivating Example
Facts:
Name | Definition | Source
A-Basis | The mechanical property value is the value above which … | ASM Handbook, Volume 21: Composites.
ABasis | A statistically-based material property; a 95% lower… | Composite Materials Handbook Volume 1. MIL-HDBK-17F-1F, 17 June 2002
A-Basis | The lower of either a statistically calculated number… | Metallic Materials and Elements for Aerospace Vehicle Structures, MIL-HDBK-5J, 31 January 2003
Motivating Example
Facts:
Name | Definition | Source
YoungsModulus | The ratio of normal stress to corresponding … | ASM Handbook, Volume 21: Composites.
ModulusYoungs | The ratio of change in stress to change … | MIL-HDBK-17
• The same term can have multiple definitions, and each definition needs to be represented with its provenance information, such as its source and time.
Related Work
Auxiliary node approach:
Subject | Predicate | Object
A-Basis | P26s | Auxiliary node1
Auxiliary node1 | P26v | A statistically-based material …
Auxiliary node1 | P580q | …
Auxiliary node1 | P582q | …
• Properties represented in the Wikidata model do not correspond to RDF properties
• Lack of formal semantics
Semantic MediaWiki
• Extension to MediaWiki
• Representing entities and simple metadata:
The '''United Kingdom''' is a country located in [[Located in::Europe]].
• We use the Semantic Forms extension of Semantic MediaWiki for our task
• Inability to represent metadata about the metadata
Source: http://www.slideshare.net/cool_uk/semantic-mediawiki-simple-tutorial
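The inline annotation on this slide pairs a property with a value on the page where it appears. As a minimal sketch (not SMW's own parser), the following Python extracts such annotations as (page, property, value) triples. The function name `extract_annotations` and the regular expression are illustrative assumptions and cover only the simple `[[Property::value]]` form, optionally followed by a `|label`.

```python
import re

# Matches [[Property::value]] and [[Property::value|label]].
# Assumption: property names contain no ':' or '|' characters.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_annotations(page_title, wikitext):
    """Return one (page, property, value) triple per inline annotation."""
    return [
        (page_title, prop.strip(), value.strip())
        for prop, value in ANNOTATION.findall(wikitext)
    ]


text = "The '''United Kingdom''' is a country located in [[Located in::Europe]]."
print(extract_annotations("United Kingdom", text))
# -> [('United Kingdom', 'Located in', 'Europe')]
```

Each annotation yields exactly one triple about the page, which is why this form has no place to record where a given value came from.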
Our Approach
• Adopted the Singleton Property method for capturing
triple metadata in SMW
• Importing legacy data with provenance in bulk using
the Singleton Property method
• Importing existing RDF datasets with provenance into
SMW for curation
Singleton Property
A singleton property represents one specific relationship between two entities under a certain context. It is assigned a URI, like any other property, and can be considered a subproperty or an instance of a generic property.
Facts:
Subject | Predicate | Object | Source | License
Autoclave | hasDefinition | “A closed vessel for producing…” | MIL-HDBK-17F-1F, 17 | All rights reserved
Singleton Property Translation
Subject | Predicate | Object
hasDefinition#1 | rdf:sp | hasDefinition
Autoclave | hasDefinition#1 | “A closed vessel for producing…”
hasDefinition#1 | hasSource | MIL-HDBK-17
hasDefinition#1 | hasLicense | All rights reserved
Nguyen, Vinh, Olivier Bodenreider, and Amit Sheth. "Don't like RDF reification?: making statements about statements using singleton property." Proceedings of the 23rd International Conference on World Wide Web. ACM, 2014.
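The translation on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the KnowledgeWiki implementation: the helper name `to_singleton_triples` and the `#index` suffix scheme are hypothetical, and plain strings stand in for URIs.

```python
def to_singleton_triples(subject, predicate, obj, metadata, index=1):
    """Translate one fact plus its metadata into singleton property triples.

    `metadata` maps metadata predicates (e.g. "hasSource") to values.
    """
    # A fresh property that stands for this one statement only (assumed naming).
    singleton = f"{predicate}#{index}"
    triples = [
        (singleton, "rdf:sp", predicate),  # link back to the generic property
        (subject, singleton, obj),         # the fact itself
    ]
    # Metadata is attached to the singleton property, not to the subject.
    triples.extend((singleton, key, value) for key, value in metadata.items())
    return triples


for triple in to_singleton_triples(
    "Autoclave",
    "hasDefinition",
    "A closed vessel for producing…",
    {"hasSource": "MIL-HDBK-17", "hasLicense": "All rights reserved"},
):
    print(triple)
```

A second definition of the same term would use a different index (for example `hasDefinition#2`), so its source and license stay separate from the first.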
Why use Singleton Property?
• Formal semantics defined
• Scalable, e.g., to LOD
• Compatible with existing standards
– RDF, RDFS, SPARQL
• Can be used to capture multiple types of metadata
– Provenance, time, location
Fu, Gang, et al. "Exposing Provenance Metadata Using Different RDF Models." arXiv preprint arXiv:1509.02822 (2015).
Nguyen, Vinh, Olivier Bodenreider, and Amit Sheth. "Don't like RDF reification?: making statements about statements using singleton property." Proceedings of the 23rd International Conference on World Wide Web. ACM, 2014.
Hernández, Daniel, Aidan Hogan, and Markus Krötzsch. "Reifying RDF: What Works Well With Wikidata?" Proceedings of the 11th International Workshop on Scalable Semantic Web Knowledge Base Systems, co-located with the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA, USA. 2015.
Semantic MediaWiki Dataflow
[Diagram: Semantic Forms dataflow. Elements: Singleton Template Definition, Regular Template Definition, Category, Wiki Article, Property, Form Definition, Field, Input type, Data type. Relations: Uses template, Belongs to, Assign article to a category, Edit with Form, Can use (by default), Identifies, Represents, Has an, Has a, Has a default form.]
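Singleton vs. Regular Template
[Diagram: two page layouts for the term Autoclave. Regular template: Definition Text and Image, with a single Source and Rights for the page. Singleton template: Definition Text and Image, each with its own Source and Rights.]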
Regular vs. Singleton Templates
Regular template triples:
Subject | Predicate | Object
Autoclave | hasDefinition | “A closed vessel…”
Autoclave | source | “ASM Handbook”
Autoclave | license | “Reproduced by…”
Autoclave | hasImage | “Image.jpg”
Singleton template triples:
Subject | Predicate | Object
Autoclave | hasDefinition#1 | “A closed vessel…”
hasDefinition#1 | singletonPropertyOf | skos:definition
hasDefinition#1 | source | “ASM Handbook”
hasDefinition#1 | license | “Reproduced by…”
Autoclave | hasImage#1 | “Image.jpg”
hasImage#1 | singletonPropertyOf | mv:image
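One way to see what the singleton form adds is to ask for each definition of a term together with its own source and license. The sketch below assumes triples are plain (subject, predicate, object) tuples as in the singleton template table; `describe` is a hypothetical helper, not part of SMW.

```python
def describe(triples, subject, generic_property):
    """Return each value of `generic_property` for `subject` with its metadata."""
    # Singleton properties declared as instances of the generic property.
    singletons = {
        s for s, p, o in triples
        if p == "singletonPropertyOf" and o == generic_property
    }
    results = []
    for s, p, o in triples:
        if s == subject and p in singletons:
            # Metadata triples have the singleton property as their subject.
            meta = {
                mp: mo for ms, mp, mo in triples
                if ms == p and mp != "singletonPropertyOf"
            }
            results.append({"value": o, **meta})
    return results


triples = [
    ("Autoclave", "hasDefinition#1", "A closed vessel…"),
    ("hasDefinition#1", "singletonPropertyOf", "skos:definition"),
    ("hasDefinition#1", "source", "ASM Handbook"),
    ("hasDefinition#1", "license", "Reproduced by…"),
    ("Autoclave", "hasImage#1", "Image.jpg"),
    ("hasImage#1", "singletonPropertyOf", "mv:image"),
]
print(describe(triples, "Autoclave", "skos:definition"))
```

With the regular template triples, `source` and `license` hang off Autoclave itself, so the same lookup could not tell which definition or image they describe.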
Overall Architecture
[Figure: overall architecture diagram]
Implementation
• Singleton Templates are our enhancement to SMW
• Implemented a parser function that handles Singleton Template parsing using a magicWord
• Registered the magicWord
• The parser function processes the Singleton Templates and generates the RDF triples
Use Case in Materials Science
• Properties of interest to domain experts:
– Definition Text
– Source
– License
– Creator
– Abbreviation
– Synonyms
– Units
– …
mv: is the MatVocab namespace
Steps to create vocabularies
• Import existing vocabularies into SMW: SKOS, QUDT, etc.
• Create Term pages
• Create relevant properties: [[Property name:: property value]]
• Create the regular template and singleton templates: 11 templates (6 singleton templates, 5 regular templates)
• Create the Form: Material Manufacturing and Design Form
MatVocab Form to create a term
Import legacy data with provenance
• Data from Excel spreadsheet files – 3 vocabularies
• We map the CSV data into the 11 predefined templates
• Some of the data is mapped to regular templates
• The rest is mapped to singleton templates
Structural Materials Vocabulary
Sample CSV data
Title | tmpltDefinText[Definition Text] | tmpltDefinText[Source] | tmpltDefinText[License Agreement]
Abasis | The mechanical property… | ASM Handbook | Reproduced with permission...
Abasis | A statistically-based material property … | MIL-HDBK-17 | Approved for public release
Abhesive | A material that resists adhesion… | MIL-HDBK-17 | Approved for public release
Ablation | The degradation decomposition… | MIL-HDBK-5J | Reproduced with permission...
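The sample CSV suggests how rows map onto templates: the Title column names the page, and every other column header names a template and one of its fields. The Python below is a rough sketch of that mapping, assuming the `Template[Field]` header pattern from the sample and the standard `{{Template|field=value}}` call syntax. The function name `rows_to_wikitext` is hypothetical and this is not the import code used in KnowledgeWiki.

```python
import csv
import io
import re
from collections import defaultdict

# Header pattern assumed from the sample: "Template[Field]".
HEADER = re.compile(r"^\s*(?P<template>[^\[\]]+?)\s*\[(?P<field>[^\[\]]+)\]\s*$")


def rows_to_wikitext(csv_text):
    """Return {page title: [template calls]}, one call per template per row."""
    pages = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        calls = defaultdict(dict)
        for header, value in row.items():
            match = HEADER.match(header or "")
            if match and value:  # skip the Title column and empty cells
                calls[match["template"]][match["field"]] = value
        for template, fields in calls.items():
            args = "".join(f"|{name}={value}" for name, value in fields.items())
            pages[row["Title"]].append("{{" + template + args + "}}")
    return dict(pages)


sample = """Title,tmpltDefinText[Definition Text],tmpltDefinText[Source]
Abasis,The mechanical property…,ASM Handbook
Abasis,A statistically-based material property …,MIL-HDBK-17
"""
for title, calls in rows_to_wikitext(sample).items():
    print(title, calls)
```

Because each CSV row becomes its own template call, the two Abasis definitions stay paired with their respective sources on the same page.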
Statistics of the Use Case
# | Type | SMW
1 | Number of vocabularies imported | 3
2 | Total number of terms imported from ASM | 1,295
3 | Total number of terms imported from MIL-HDBK-5 | 19
4 | Total number of terms imported from MIL-HDBK-17 | 179
5 | Total number of Singleton Templates created | 6
6 | Total number of Regular Templates created | 5
7 | Total number of pages created | 1,685
RDF import with provenance
• We developed an extension similar to “CSV Import”
• We experimented with the YAGO-SP2 dataset
• Ongoing work
• More details can be found in our paper
For further information, please visit
http://wiki.knoesis.org/index.php/KnowledgeWiki
RDF Import with provenance
• Curating existing RDF datasets with provenance
• No other tool provides this functionality
• Implemented the “RDF Import” extension
• Implemented a method to automatically identify the RDF subgraph structure
• Identify the regular properties and singleton properties
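The last bullet on this slide can be illustrated with a small classifier. This is only a sketch of the idea, not the RDF Import extension itself. It assumes triples are plain tuples and that singleton properties are declared with a `singletonPropertyOf` predicate, as in the earlier tables. The function name is hypothetical.

```python
def classify_properties(triples, sp_predicate="singletonPropertyOf"):
    """Split the predicates used in `triples` into singleton and regular ones."""
    # Anything declared as a singleton property of some generic property.
    declared = {s for s, p, o in triples if p == sp_predicate}
    singleton, regular = set(), set()
    for s, p, o in triples:
        if p == sp_predicate:
            continue  # the declaration itself is neither kind
        (singleton if p in declared else regular).add(p)
    return singleton, regular


triples = [
    ("Autoclave", "hasDefinition#1", "A closed vessel…"),
    ("hasDefinition#1", "singletonPropertyOf", "skos:definition"),
    ("hasDefinition#1", "source", "ASM Handbook"),
    ("Autoclave", "hasImage", "Image.jpg"),
]
singleton, regular = classify_properties(triples)
print(sorted(singleton), sorted(regular))
```

In this sketch, metadata predicates such as `source` count as regular properties whose subject happens to be a singleton property.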
RDF Reification vs. Singleton Property
Time-aware Facts:
Subject | Predicate | Object | Starts | Ends
Bob Dylan | marriedTo | Sara Lownds | 1965-11-22 | 1977-06-29
Standard RDF Reification:
Subject | Predicate | Object
#stmt1 | type | Statement
#stmt1 | hasSubject | Bob Dylan
#stmt1 | hasProperty | marriedTo
#stmt1 | hasObject | Sara Lownds
Bob Dylan | marriedTo | Sara Lownds
#stmt1 | starts | 1965-11-22
#stmt1 | ends | 1977-06-29
Singleton Property:
Subject | Predicate | Object
marriedTo#1 | rdf:sp | marriedTo
Bob Dylan | marriedTo#1 | Sara Lownds
marriedTo#1 | starts | 1965-11-22
marriedTo#1 | ends | 1977-06-29
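To make the size difference concrete, the reification side of this slide can be generated programmatically and its triples counted. The sketch below uses the slide's shorthand predicate names (`type`, `hasSubject`, `hasProperty`, `hasObject`) rather than the full RDF vocabulary. The helper `reify` and the statement id are illustrative assumptions.

```python
def reify(subject, predicate, obj, metadata, stmt_id="#stmt1"):
    """Build the standard-reification triples for one fact plus its metadata."""
    triples = [
        (stmt_id, "type", "Statement"),
        (stmt_id, "hasSubject", subject),
        (stmt_id, "hasProperty", predicate),
        (stmt_id, "hasObject", obj),
        (subject, predicate, obj),  # the asserted fact itself
    ]
    # Metadata is attached to the statement node.
    triples += [(stmt_id, key, value) for key, value in metadata.items()]
    return triples


reified = reify(
    "Bob Dylan", "marriedTo", "Sara Lownds",
    {"starts": "1965-11-22", "ends": "1977-06-29"},
)
print(len(reified))  # -> 7
```

The singleton property form on the same slide expresses the same fact and its two time annotations in four triples.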
Other approaches
• Standard Reification
• n-ary relations
• Named Graphs
Semantic Form Diagram
Source: http://edutechwiki.unige.ch/en/File:Semantic_Form_Diagram.svg
Regular Template vs. Singleton Triples
General pattern:
Subject | Predicate | Object
subj | prop_1 | value_1
subj | prop_2 | value_2
Example:
Subject | Predicate | Object
Autoclave | hasDefinition | “A closed vessel…”
Autoclave | hasImage | Autoclave.png
Singleton Template Triples
General pattern:
Subject | Predicate | Object
SomePageTitle | singletonProperty#n | value_1
singletonProperty#n | singletonPropertyOf | property_1
singletonProperty#n | property_2 | value_2
singletonProperty#n | property_3 | value_3
Example:
Subject | Predicate | Object
Autoclave | hasDefinition#1 | “A closed vessel…”
hasDefinition#1 | singletonPropertyOf | skos:definition
hasDefinition#1 | source | “ASM Handbook”
hasDefinition#1 | license | “Reproduced by…”