Semantic Technologies in Practice – Introduction to Part II Eva Blomqvist 2012-09-17

September 13, 2012
Outline
• Course information
• Introduction to the Semantic Web and the Web of Data
• Re-engineering and publishing of linked data
Course information
Department of Computer and Information Science (IDA)
Linköpings universitet, Sweden
September 13, 2012
Course information – Web page
• http://www.ida.liu.se/~evabl45/semtechpracticecourse.en.shtml
• Part I – if you feel you need to freshen up your background
• Part II – slides, other material
• Part III – ideas on course projects
Course requirements – Part II (4hp)
• Attendance at the seminars
  ‣ If you cannot attend, let me know and you will get some reading material instead, and write a summary of it
  ‣ Exercises are completed on your own
  ‣ Deadline 30/11 for all seminars
• Completing the exercises
  ‣ Exercises can be done individually or in groups of 2 (max)
  ‣ Hand in through e-mail
  ‣ If you are not finished at the end of the session?
    • Complete the tasks and send the result by e-mail
    • DEADLINE for seminar exercises – 30/11
    • Firm deadline – no second chance
  ‣ What does “complete” mean?
    • Hand in your result; it does not have to be completely correct, but you have to show that you have tried and done the whole exercise!
Course idea – Part II
• Meet international researchers in the field and get an idea of the spectrum of research on applications of semantic technologies
• Get a hands-on feeling for some of the technologies that exist and are used by practitioners today
• Get to know some tools
• Get ideas for how to use semantic technologies (practically) in your own projects
• Get ideas for Part III
• What it’s not about:
  ‣ Knowing all the details of tools and formats
  ‣ Theory (at least not that much...)
Schedule
• 20/9, 10-16 (John von Neumann) - Fabio Ciravegna teaches how to use semantic technologies for analyzing social media data
• ?? – Collaborative ontology engineering using XD
• 29/10, 10-16 (John von Neumann) - Kurt Sandkuhl (topic is still under preparation)
• 31/10, 13-15 (Alan Turing) + 1/11, 9-12 (John von Neumann) - Valentina Presutti teaches how to use semantic technologies for handling and enhancing CMS content
Introduction to the Semantic
Web and the Web of Data
Web vs. Semantic Web
What do you see?
Semantics makes the web smarter
Published by CIO Sweden
Tomorrow's internet is not only adapted for humans. New solutions let machines perform smarter searches.
The Semantic Web is intended as an extension of the existing web rather than as an entirely new web. The internet is a wonderful invention. Not least the part called the World Wide Web, or the web, which makes it possible to navigate in a graphical environment via hyperlinks. In many ways it works excellently – for us humans.
We can present documents that are displayed on screens all around the globe, and other people can, provided that the language used is intelligible to both parties, take in the information.
But for the machine it is worse. Today's web is readable by machines. But it is not understandable.
…
By: Robert Brännström
cioreporter@idg.se
What does a computer see?
Semantik
gör
webben
Publicerad av
CIO
smartare
Sweden
Morgondagens internet är inte bara anpassat för människor. Nya
lösningar gör att maskiner kan göra smartare sökningar.
Den semantiska webben är tänkt som en förlängning av den
befintliga webben snarare än som en helt ny webb. Internet är
en underbar uppfinning. Inte minst den del som kallas World Wide
Web, eller webben, och som gör det möjligt att söka sig fram i en
grafisk miljö via hyperlänkar. Det funkar på många sätt utmärkt
– för oss människor.
Vi kan presentera dokument som visas på skärmar runt om hela
jordklotet och andra människor kan, förutsatt att det språk som
används är begripligt för båda parter, ta del av informationen.
Men för maskinen är det värre. Webben
maskiner. Men den är inte förståbar.
…
Av: Robert Brännström
cioreporter@idg.se
av
idag
är
läsbar för
But what about XML tags?
<titel> Semantics makes the web smarter </titel>
Published by <utgivare> CIO Sweden </utgivare>
<ingress> Tomorrow's internet is not only adapted for humans. New solutions let machines perform smarter searches. </ingress>
<brödtext> The Semantic Web is intended as an extension of the existing web rather than as an entirely new web. The internet is a wonderful invention. Not least the part called the World Wide Web, or the web, which makes it possible to navigate in a graphical environment via hyperlinks. In many ways it works excellently – for us humans.
We can present documents that are displayed on screens all around the globe, and other people can, provided that the language used is intelligible to both parties, take in the information.
But for the machine it is worse. Today's web is readable by machines. But it is not understandable.
… </brödtext>
By: <författare> Robert Brännström </författare>
<e-post> cioreporter@idg.se </e-post>
However…
<titel> Semantik
gör
webben
Publicerad av <utgivare> CIO
smartare</titel>
Sweden </utgivare>
<ingress> Morgondagens internet är inte bara anpassat för
människor. Nya lösningar gör att maskiner kan göra smartare
sökningar. </ingress>
<brödtext> Den semantiska webben är tänkt som en förlängning av
den befintliga webben snarare än som en helt ny webb.
Internet är en underbar uppfinning. Inte minst den del som
kallas World Wide Web, eller webben, och som gör det möjligt att
söka sig fram i en grafisk miljö via hyperlänkar. Det funkar på
många sätt utmärkt – för oss människor.
Vi kan presentera dokument som visas på skärmar runt om hela
jordklotet och andra människor kan, förutsatt att det språk som
används är begripligt för båda parter, ta del av
informationen.
Men för maskinen är det värre. Webben
maskiner. Men den är inte förståbar.
… </brödtext>
av
idag
Av: <författare> Robert Brännström </författare>
<e-post> cioreporter@idg.se </e-post>
är
läsbar för
Semantic tags?
<dc:title> Semantics makes the web smarter </dc:title>
Published by
<dc:publisher> CIO Sweden </dc:publisher>
<dc:abstract> Tomorrow's internet is not only adapted for humans. New solutions let machines perform smarter searches. </dc:abstract>
<example:content> The Semantic Web is intended as an extension of the existing web rather than as an entirely new web. The internet is a wonderful invention. Not least the part called the World Wide Web, or the web, which makes it possible to navigate in a graphical environment via hyperlinks. In many ways it works excellently – for us humans.
We can present documents that are displayed on screens all around the globe, and other people can, provided that the language used is intelligible to both parties, take in the information.
But for the machine it is worse. Today's web is readable by machines. But it is not understandable.
… </example:content>
By:
<dc:creator> Robert Brännström </dc:creator>
<foaf:mbox> cioreporter@idg.se </foaf:mbox>
Resources for the Semantic Web
• Metadata
  ‣ Resources are marked-up with descriptions of their content.
  ‣ No good unless everyone speaks the same language
• Terminologies
  ‣ Provide shared and common vocabularies of a domain, so search engines, agents, authors and users can communicate.
  ‣ No good unless everyone means the same thing
• Ontologies
  ‣ Provide a shared and common understanding of a domain that can be communicated across people and applications, and will play a major role in supporting information exchange and discovery
The Semantic Web Layers
RDF
• RDF stands for Resource Description Framework
• It is a W3C Recommendation
  ‣ http://www.w3.org/RDF
• RDF is a graphical formalism (+XML syntax + semantics)
  ‣ for representing (meta)data
  ‣ for describing the semantics of information in a machine-accessible way
• Provides a simple data model based on triples
RDF Data Model
• Statements are <subject, predicate, object> triples:
  ‣ <Sean,hasColleague,Ian>
• Can be represented as a graph:
• Statements describe properties of resources
• A resource is any object that can be pointed to by a URI:
  ‣ A document, a picture, a paragraph on the Web, http://www.cs.man.ac.uk/index.html, a book in the library, a real person (?), isbn://0141184280
• Properties themselves are also resources (URIs)
Linking RDF Statements
• The subject of one statement can be the object of another
• Such collections of statements form a directed, labeled graph
• Note that the object of a triple can also be a “literal” (e.g. a string)
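A minimal Python sketch of how linked statements form a directed, labeled graph. Only the Sean/hasColleague/Ian statement comes from the slides; the `http://example.org/` namespace, the `hasName` property, and the `Literal` wrapper are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple

class Literal(NamedTuple):
    """A plain data value (e.g. a string) rather than a resource."""
    value: str

EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Each statement is a (subject, predicate, object) triple.
triples = [
    (EX + "Sean", EX + "hasColleague", EX + "Ian"),
    # Ian is the object above and the subject here: the statements link up.
    (EX + "Ian", EX + "hasName", Literal("Ian")),  # hypothetical property
]

# Directed, labeled graph: subject -> list of (edge label, object).
graph = defaultdict(list)
for s, p, o in triples:
    graph[s].append((p, o))

def reachable(start):
    """All nodes (resources and literals) reachable by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(obj for _pred, obj in graph.get(node, []))
    return seen

print(reachable(EX + "Sean"))
```

Literals only ever appear as objects, so they are leaf nodes of the graph.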
What does RDF give us?
• A mechanism for annotating data and resources.
  ‣ Supported by for instance RDFa as a link to “normal” web pages
• Single (simple) data model.
• Syntactic consistency between names (URIs).
• Low-level integration of data.
  ‣ E.g., through “same as”-statements
Querying RDF using SPARQL
Data:
<http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .
Query:
SELECT ?title
WHERE {
  <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title .
}
Result:
"SPARQL Tutorial"
Linked Data - http://linkeddata.org/
• RDF data published on the web according to a set of principles:
  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
  4. Include links to other URIs, so that they can discover more things
• The 5-star model to grade the datasets
• Linking Open Data Project – LOD
  ‣ An initiative to publish open linked data on the web
LOD data
A LOD Example: DBpedia
• Extracts structured information (RDF) from Wikipedia
• Browsable through a number of tools (see http://wiki.dbpedia.org/OnlineAccess)
• Queryable through SPARQL endpoint(s)
• Example query to DBpedia: “All soccer players who played as goalkeeper for a club that has a stadium with more than 40,000 seats and who were born in a country with more than 10 million inhabitants”
Microformats
• Embedding “semantic” information directly into the (X)HTML – “semantic” in the human sense
• Introduces new values for existing XHTML attributes
distributed under a
<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">Creative Commons License</a>
• “license” is a reserved keyword for expressing a licensing relation → there are no namespaces!
Drawback – no formal definition of the “keywords”
The hCard Microformat
• hCard is a microformat representation of the common vCard format – embedding vCards into HTML
• http://microformats.org/wiki/hcard
BEGIN:VCARD
VERSION:3.0
N:Çelik;Tantek
FN:Tantek Çelik
URL:http://tantek.com
END:VCARD
<div class="vcard">
<a class="url fn" href="http://tantek.com/"> Tantek Çelik </a>
</div>
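A rough sketch of how a consumer could read the hCard above, using only Python's standard `html.parser`. The class name `HCardParser` is hypothetical, and only the `fn` and `url` class names from the example are handled; a real hCard parser covers far more of the format.

```python
from html.parser import HTMLParser

class HCardParser(HTMLParser):
    """Collects the 'fn' text and 'url' href from class attributes (simplified)."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._capture_fn = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "url" in classes and "href" in attrs:
            self.card["url"] = attrs["href"]
        if "fn" in classes:
            self._capture_fn = True
            self.card["fn"] = ""

    def handle_data(self, data):
        if self._capture_fn:
            self.card["fn"] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the formatted-name capture.
        self._capture_fn = False

html = '''<div class="vcard">
<a class="url fn" href="http://tantek.com/"> Tantek Çelik </a>
</div>'''

parser = HCardParser()
parser.feed(html)
parser.close()
card = {key: value.strip() for key, value in parser.card.items()}
print(card)
```

The meaning of `fn` and `url` is hard-coded in the parser, which is the drawback noted for microformats: the keywords have no formal, machine-readable definition.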
Semantic Annotations - RDFa
• Embedding RDF data in HTML pages
  ‣ RDF data is produced in a simple manner
  ‣ Can be connected to some formal interpretation (ontology)
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
<h2 property="dc:title">The Trouble with Bob</h2>
<h3 property="dc:creator">Alice</h3>
</div>
• Uses XML namespaces to refer to the definition of the concepts used, e.g., the Dublin Core ontology
• http://www.w3.org/TR/xhtml-rdfa-primer/
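A simplified sketch of the idea behind RDFa extraction, in standard-library Python: namespace declarations expand the prefixed `property` names into full URIs, and the element text becomes the object of a triple. The class `RDFaLiteParser` and the page URI used as subject are assumptions for illustration; real RDFa processing (`about`, `rel`, nested subjects and so on) is considerably richer.

```python
from html.parser import HTMLParser

class RDFaLiteParser(HTMLParser):
    """Turns property="prefix:name" attributes into (subject, predicate, literal) triples."""

    def __init__(self, subject):
        super().__init__()
        self.subject = subject      # assumed: the page itself is the subject
        self.prefixes = {}
        self.triples = []
        self._pending = None        # predicate waiting for its text content

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        prop = dict(attrs).get("property")
        if prop:
            prefix, _, local = prop.partition(":")
            self._pending = self.prefixes.get(prefix, prefix + ":") + local

    def handle_data(self, data):
        if self._pending and data.strip():
            self.triples.append((self.subject, self._pending, data.strip()))
            self._pending = None

    def handle_endtag(self, tag):
        self._pending = None

PAGE = "http://example.com/alice"  # hypothetical subject URI
doc = '''<div xmlns:dc="http://purl.org/dc/elements/1.1/">
<h2 property="dc:title">The Trouble with Bob</h2>
<h3 property="dc:creator">Alice</h3>
</div>'''

parser = RDFaLiteParser(PAGE)
parser.feed(doc)
parser.close()
for triple in parser.triples:
    print(triple)
```

Unlike the microformat case, the predicate names resolve to URIs in the Dublin Core namespace, so their definition can be looked up.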
Querying RDFa
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:foaf="http://xmlns.com/foaf/0.1/">
<head profile="http://www.w3.org/1999/xhtml/vocab">
<title>Alice in Semantic Wonderland</title>
<base href="http://example.com/alice"></base>
<link rel="stylesheet" type="text/css" href="http://www.w3.org/2006/07/SWD/RDFa/primer/style.css" />
</head>
<body> <h1>Alice in Semantic Wonderland</h1>
<div id="meta"><a href="http://tinyurl.com/5v7jzc">
<img src="http://www.w3.org/Icons/SW/Buttons/sw-rdfa-gray.png" alt="get metadata in RDF Turtle"/></a>
</div>
<div about="/posts/trouble_with_bob">
<h2 property="dc:title">The trouble with Bob</h2>
<h3 property="dc:creator">Alice</h3>
<p>The trouble with Bob is that he takes much better photos than I do:</p>
<div class="imgbox" about="http://www.w3.org/2006/07/SWD/RDFa/primer/sunset.jpg">
<img src="http://www.w3.org/2006/07/SWD/RDFa/primer/sunset.jpg" alt="sunset" />
<div><span property="dc:title">Beautiful Sunset</span>
by <span property="dc:creator">Bob</span>.
</div>
</div> … </body> </html>
Querying RDFa
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix cc: <http://creativecommons.org/ns#> .
<http://example.com/alice> cc:license <http://creativecommons.org/licenses/by/3.0/> .
<http://example.com/posts/trouble_with_bob> dc:creator "Alice" ;
    dc:title "The trouble with Bob" .
<http://example.com/alice#me> a foaf:Person ;
    foaf:knows [ a foaf:Person; foaf:homepage <http://example.com/manu>; foaf:name "Manu" ],
               [ a foaf:Person; foaf:homepage <http://example.com/bob>; foaf:name "Bob" ],
               [ a foaf:Person; foaf:homepage <http://example.com/eve>; foaf:name "Eve" ] ;
    foaf:mbox <mailto:alice@example.com> ;
    foaf:name "Alice Birpemswick" ;
    foaf:phone <tel:+1-617-555-7332> .
</2006/07/SWD/RDFa/primer/sunset.jpg>
    dc:creator "Bob" ;
    dc:title "Beautiful Sunset" .
<http://example.com/posts/jos_barbecue>
    dc:creator "Eve" ;
    dc:title "Jo's Barbecue" .
Querying RDFa
Who is the creator of the post with the title "The trouble with Bob"?

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?creator_of_post
FROM <http://www.w3.org/2007/08/pyRdfa/extract?uri=http://www.w3.org/2006/07/SWD/RDFa/primer/alice-example.html>
WHERE { ?post dc:title ?post_title ;
              dc:creator ?creator_of_post .
        FILTER regex(?post_title, "The trouble with Bob", "i")
}
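To make the evaluation of such a query concrete, here is a small Python sketch that matches two triple patterns sharing the variable `?post` and then applies the regex filter. It is a toy matcher over a hand-copied subset of the extracted triples, not how a SPARQL engine is implemented; the helper names `match` and `solve` are hypothetical.

```python
import re

DC = "http://purl.org/dc/elements/1.1/"

# Subset of the triples extracted from the RDFa page (copied by hand).
triples = [
    ("http://example.com/posts/trouble_with_bob", DC + "title", "The trouble with Bob"),
    ("http://example.com/posts/trouble_with_bob", DC + "creator", "Alice"),
    ("http://www.w3.org/2006/07/SWD/RDFa/primer/sunset.jpg", DC + "title", "Beautiful Sunset"),
    ("http://www.w3.org/2006/07/SWD/RDFa/primer/sunset.jpg", DC + "creator", "Bob"),
    ("http://example.com/posts/jos_barbecue", DC + "title", "Jo's Barbecue"),
    ("http://example.com/posts/jos_barbecue", DC + "creator", "Eve"),
]

def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple; return extended bindings or None."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                      # variable
            if extended.setdefault(term, value) != value:
                return None                           # clashes with earlier binding
        elif term != value:                           # constant must match exactly
            return None
    return extended

def solve(patterns, data):
    """Join all patterns: every solution binds the variables consistently."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [ext for b in solutions for t in data
                     if (ext := match(pattern, t, b)) is not None]
    return solutions

patterns = [("?post", DC + "title", "?post_title"),
            ("?post", DC + "creator", "?creator_of_post")]

# FILTER regex(?post_title, "The trouble with Bob", "i")
creators = [s["?creator_of_post"] for s in solve(patterns, triples)
            if re.search("The trouble with Bob", s["?post_title"], re.IGNORECASE)]
print(creators)
```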
Reengineering and
Refactoring
Slides partly by Aldo Gangemi, STLab, ISTC-CNR, Italy
Why do I need transformations?
• A scenario
  ‣ My system fetches knowledge from different sources in LOD
  ‣ Each of these sources uses its own ontology/vocabulary
• Another scenario
  ‣ I have legacy data in a DB or in a custom XML format
  ‣ This data should be integrated with RDF data
How to arrive at a homogeneous representation of knowledge expressed with heterogeneous schemas/vocabularies?
Motivations
• The Web of Data is fed by “triplifiers”, tools able to transform content to Linked Data
• Lack of good practices for knowledge representation and organization
• Triplifiers implement various methods, typically based on bulk recipes which allow for no or limited customization of the process
• The transformation relies on predetermined implicit assumptions on the domain semantics of the non-RDF data source
An Example
dbpedia:Bob_Marley rdf:type dbpedia:Person ;
    foaf:name “Bob Marley” .
nyt:65169961111056171853 rdf:type skos:Concept ;
    skos:prefLabel “Marley, Bob” .
I want to aggregate the two graphs
A DB stores data and answers queries
‣ STLab was founded in 2008
‣ STLab is in Italy
‣ STLab does research on semantic technologies
‣ Aldo is 48
‣ Aldo works in Rome
‣ Aldo does research on Semantic Web

Persons
nome     data di nascita    luogo_di_lavoro    temi
Aldo     08-16-962          Roma               Semantic Web
Alfio    ….                 ….                 ….

Labs
laboratori    data_di_fondazione    sede      temi
STLab         2008                  Italia    Tecnologie Semantiche
LOA           ….                    ….        ….
Complex queries?
• Who is interested in “Semantic Web” and is working in the same country as STLab is located?
  ‣ luogo_di_lavoro (workplace) and sede (seat): same concept?
  ‣ Roma (Rome) and Italia (Italy): same country?
  ‣ No answer from the Persons and Labs tables
Is mapping enough?
• luogo_di_lavoro (workplace) and sede (seat) in the Persons and Labs tables: mapped
• Roma (Rome) and Italia (Italy): same country? And this one?
Publishing DB data as RDF on the Web
Triplification of the Persons and Labs tables:
perscnr:Aldo foaf:based_near dbpedia:Rome .
labs:STLab foaf:based_near dbpedia:Italy .
Data linking and querying on the web of data
perscnr:Aldo foaf:based_near dbpedia:Rome .
perscnr:Aldo foaf:topic_interest dbpedia:Semantic_Web .
dbpedia:Rome dbpedia:subdivisionName dbpedia:Italy .
labs:STLab foaf:based_near dbpedia:Italy .
• Who is interested in “Semantic Web” and is working in the same country as STLab is located?
How to answer really tough queries?
perscnr:Aldo foaf:based_near dbpedia:Rome .
perscnr:Aldo foaf:topic_interest dbpedia:Semantic_Web .
dbpedia:Rome dbpediap:subdivisionName dbpedia:Italy .
labs:STLab foaf:based_near dbpedia:Italy .
dbpedia:Italy owl:sameAs eurostat:Italien .
eurostat:Italien eurostat:unemployment_rate_total 4.8 .
‣ Who is interested in “Semantic Web” and is working in a country where the unemployment rate is lower than 5%?
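A hedged Python sketch of how this query can be answered once the data sets are linked. The triples restate the graph on the slide; the edge directions are read off the diagram and are therefore an assumption, and the helper functions are hypothetical. The key step is following `owl:sameAs` from the DBpedia resource to the Eurostat one.

```python
# Triples as read from the diagram (edge directions are an assumption).
triples = {
    ("perscnr:Aldo", "foaf:based_near", "dbpedia:Rome"),
    ("perscnr:Aldo", "foaf:topic_interest", "dbpedia:Semantic_Web"),
    ("dbpedia:Rome", "dbpediap:subdivisionName", "dbpedia:Italy"),
    ("labs:STLab", "foaf:based_near", "dbpedia:Italy"),
    ("dbpedia:Italy", "owl:sameAs", "eurostat:Italien"),
    ("eurostat:Italien", "eurostat:unemployment_rate_total", 4.8),
}

def objects(subject, predicate):
    return {o for s, p, o in triples if s == subject and p == predicate}

def subjects(predicate, obj):
    return {s for s, p, o in triples if p == predicate and o == obj}

def same_as(resource):
    """The resource plus its direct owl:sameAs links, in either direction."""
    return ({resource}
            | objects(resource, "owl:sameAs")
            | subjects("owl:sameAs", resource))

def unemployment_rates(country):
    """Rates stated for the country under any of its equivalent names."""
    rates = set()
    for alias in same_as(country):
        rates |= objects(alias, "eurostat:unemployment_rate_total")
    return rates

answer = sorted(
    person
    for person in subjects("foaf:topic_interest", "dbpedia:Semantic_Web")
    for place in objects(person, "foaf:based_near")
    for country in objects(place, "dbpediap:subdivisionName")
    if any(rate < 5 for rate in unemployment_rates(country))
)
print(answer)
```

Without the `owl:sameAs` link, the Eurostat figure would never be reached from the DBpedia resource and the query would return nothing.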
Knowledge transformation issues
• Syntactic interoperability bottleneck (platform + data model)
  ‣ e.g. rdb, eav, xml, text, prolog, N3
  ‣ e.g. rdb with adjacency list, path enumeration
• Semantic interoperability bottleneck (logical + conceptual level)
  ‣ e.g. rdb with: lexical, statistical, formal data
  ‣ e.g. two different databases on the same topic
• Social, pragmatic interoperability bottleneck (privacy, sustainability, policy)
  ‣ e.g. different requirements, organizational contexts, etc.
Dealing with web semantics: current state
• Much enthusiasm, a lot of nice, different ideas
• Much confusion and mutual misunderstanding between “scruffies” and “neats”
  ‣ Pushing formal semantics beyond its limits (e.g. the “owl:sameAs” dispute)
  ‣ Doing ad-hoc apps
  ‣ Mixing up strings, classes, terms, concepts, topics, tags, etc.
  ‣ Trivializing transformation from social to formal semantics (e.g. when translating a syntactic frame directly to an OWL construct)
Some techniques for semantic data reuse
• Virtual linked data
  ‣ Automatic RDB schema conversion to RDFS
  ‣ RDB data browsing and on-demand automatic conversion to RDF
  ‣ Sample tools: SPARQL endpoint + D2R
  ‣ Dataset example: IMDB
  ‣ +Time to usage –Flexibility
• Ontology-based access with ad-hoc queries
  ‣ (DL-Lite) ontology to be designed separately
  ‣ Ad-hoc SQL query on RDB, “embedded” in class spec
  ‣ On-demand ontology-based navigation
  ‣ Sample tools: Mastro + Quonto
  ‣ +Complexity –Flexibility –Time to usage
• Physical linked data with custom ontologies
  ‣ Custom conversion of RDB/XML to one or more OWL ontologies
  ‣ Custom conversion of data to RDF-OWL datasets that can be published and queried
  ‣ Sample tools: SPARQL endpoint + Semion
  ‣ Sample datasets: DBpedia, data.cnr.it
  ‣ +Flexibility ±Time to usage ±Complexity
• Key aspects
  ‣ Mapping specification
  ‣ Consumable RDF data semantics
Transformation patterns
Types of transformation patterns
1.  Direct structural morphism
    Table “Broader”:
    Narrow concept    Broad concept
    Paris             France

    :Broader rdf:type dbs:Table
    :Narrow_concept rdf:type dbs:Column
    :France rdf:type dbs:Datum
2.  Semantic interpretation
    Table “Broader”:
    Narrow concept    Broad concept
    Paris             France

    :Broader rdf:type owl:ObjectProperty
    :Narrow_concept rdf:type owl:Class
    :Paris :broader :France
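The contrast between the two patterns can be sketched in Python as two functions over the same table. This is an illustrative reading of the slide, not Semion's implementation; the function names are hypothetical, and the sketch uses one lowercase property name (`:broader`) for both the declaration and the assertions.

```python
table_name = "Broader"
columns = ["Narrow_concept", "Broad_concept"]
rows = [("Paris", "France")]

def structural_morphism(name, columns, rows):
    """Pattern 1: describe the database structure itself, adding no domain semantics."""
    triples = [(f":{name}", "rdf:type", "dbs:Table")]
    triples += [(f":{col}", "rdf:type", "dbs:Column") for col in columns]
    triples += [(f":{value}", "rdf:type", "dbs:Datum") for row in rows for value in row]
    return triples

def semantic_interpretation(name, columns, rows):
    """Pattern 2: read the table as a relation between the things it mentions."""
    prop = f":{name[0].lower()}{name[1:]}"   # assumed naming: table name as property
    triples = [(prop, "rdf:type", "owl:ObjectProperty")]
    triples += [(f":{col}", "rdf:type", "owl:Class") for col in columns]
    triples += [(f":{narrow}", prop, f":{broad}") for narrow, broad in rows]
    return triples

structural = structural_morphism(table_name, columns, rows)
semantic = semantic_interpretation(table_name, columns, rows)
for triple in structural + semantic:
    print(triple)
```

Only the second output says anything about Paris and France; the first merely records that both strings are data stored in a table.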
Transformation patterns (cont.)
3.  Re-interpretation, e.g. through alignment patterns
    • :Narrow_concept → skos:Concept
      ‣ also as mediated semantic interpretation
      ‣ also as revised semantic interpretation
4.  Production: new entities, vocabulary/string manipulation
    • :Narrow_concept → :Concept
    • “plant flora plant_life” → :Plant, :Flora, :PlantLife
Semion – Example method
A common recipe
•  each table is an rdfs:Class
•  each table record is an owl:Individual
•  each table column is an rdf:Property
Example
Class: Person
DatatypeProperty: firstName
DatatypeProperty: lastName
Individual: Person1
    Type: Person
    Facts: firstName “Aldo”
           lastName “Gangemi”
Individual: Person2
    Type: Person
    Facts: firstName “Valentina”
           lastName “Presutti”
…
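The common recipe is simple enough to sketch in a few lines of Python. The function `triplify` and the way individuals are numbered (`Person1`, `Person2`) are assumptions chosen to mirror the example above; actual triplifiers differ in naming and typing details.

```python
def triplify(table, columns, rows):
    """Bulk recipe: table -> class, column -> property, record -> individual."""
    triples = [(table, "rdf:type", "rdfs:Class")]
    triples += [(col, "rdf:type", "rdf:Property") for col in columns]
    for number, row in enumerate(rows, start=1):
        individual = f"{table}{number}"              # assumed naming scheme
        triples.append((individual, "rdf:type", table))
        triples += [(individual, col, value) for col, value in zip(columns, row)]
    return triples

triples = triplify(
    "Person",
    ["firstName", "lastName"],
    [("Aldo", "Gangemi"), ("Valentina", "Presutti")],
)
for triple in triples:
    print(triple)
```

Nothing in the function depends on what the table means, which is why such recipes leave little room for customization.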
Implications
• Limited customization of the transformation process
• Difficulty in adopting good practices of knowledge reengineering and ontology design
• Limited exploitation of OWL expressivity for describing the domain
just extract RDF triples!
express the domain semantics
The Semion Reengineer
• It does not add any semantics, but just the RDF format
• Semion needs the meta-model of the structure of the source (and some code)
• Currently supports RDB and XML
• Supported sources can be extended by providing new reengineering services as an OSGi bundle (not available yet)
Basic idea
A meta-model for RDBs
Example of transformation of a DB
Class: Table
ObjectProperty: hasRecord
    Domain: Table
    Range: Record
    inverseOf: isRecordOf
ObjectProperty: isRecordOf
    Domain: Record
    Range: Table
    inverseOf: hasRecord
Individual: Person
    Type: Table
    Facts: hasRecord AldoGangemi
Individual: AldoGangemi
    Types: Record
    Facts: hasDatum AldoGangemiFirstName
           hasDatum AldoGangemiLastName
           isRecordOf Person
Reengineering Rules
• Table -> individual of dbs:Table
• Column -> individual of dbs:Column
• Record -> individual of dbs:Record
• Field -> individual of dbs:Datum
Primary Keys
• Primary keys are used for URI generation
Individual: Person_stlab.istc-cnr1
    Type: Record
Foreign Keys
• Foreign keys identify relations between tables and are mapped to relations between individuals
Individual: Person_stlab.istc-cnr1
    Type: Record
    Facts: hasDatum Person_stlab.istc-cnr1_affiliation
Individual: Person_stlab.istc-cnr1_affiliation
    Type: Datum
    Facts: hasContent Institute_istc-cnr
Individual: Institute_istc-cnr
    Type: Record
    Facts: hasDatum Institute_istc-cnr_name
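A minimal Python sketch of these reengineering rules, assuming a record is a dictionary from column name to value. The function `reengineer`, the `dbs:` prefix on `hasRecord`/`hasDatum`/`hasContent`, and the column names are illustrative assumptions; only the naming of the individuals follows the slides. The primary key drives the record identifier, and a foreign-key value is turned into a reference to the record it points at.

```python
def reengineer(table, columns, primary_key, foreign_keys, rows):
    """Map a table to dbs:Table / dbs:Record / dbs:Datum individuals.

    foreign_keys maps a column name to the table it references.
    """
    triples = [(table, "rdf:type", "dbs:Table")]
    for row in rows:
        record = f"{table}_{row[primary_key]}"        # primary key -> identifier
        triples += [(record, "rdf:type", "dbs:Record"),
                    (table, "dbs:hasRecord", record)]
        for col in columns:
            datum = f"{record}_{col}"
            content = row[col]
            if col in foreign_keys:                    # foreign key -> link to a record
                content = f"{foreign_keys[col]}_{content}"
            triples += [(datum, "rdf:type", "dbs:Datum"),
                        (record, "dbs:hasDatum", datum),
                        (datum, "dbs:hasContent", content)]
    return triples

triples = reengineer(
    table="Person",
    columns=["affiliation"],
    primary_key="id",                                  # hypothetical column name
    foreign_keys={"affiliation": "Institute"},
    rows=[{"id": "stlab.istc-cnr1", "affiliation": "istc-cnr"}],
)
for triple in triples:
    print(triple)
```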
…and for XML
The Semion Refactorer
• Allows aligning a data set expressed with a specific vocabulary/ontology to another vocabulary/ontology
• Is expressed as a set of rules
• Rules are expressed in a human-readable syntax called SemionRule Syntax and can be transformed into
  ‣ SWRL rules for reasoning
  ‣ SPARQL CONSTRUCT for pure refactoring
• Rules realize recipes that can be saved (refactoring patterns)
Basic idea
TBox
ABox
SemionRule Syntax
dbs = <http://ontologydesignpatterns.org/ont/iks/dbs_l1.owl#> .
owl = <http://www.w3.org/2002/07/owl#> .
myRule[
is(dbs:Table, ?x) . has(dbs:hasColumn, ?x, ?y)
->
is(owl:Class, ?x)
]
as a SPARQL CONSTRUCT
PREFIX dbs: <http://ontologydesignpatterns.org/ont/iks/dbs_l1.owl#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
CONSTRUCT { ?x rdf:type owl:Class }
WHERE {
  ?x rdf:type dbs:Table .
  ?x dbs:hasColumn ?y
}
…and as a SWRL rule
<swrl:Variable rdf:ID="x"/>
<swrl:Variable rdf:ID="y"/>
<swrl:Imp>
  <swrl:body rdf:parseType="Collection">
    <swrl:ClassAtom>
      <swrl:classPredicate rdf:resource="&dbs;Table"/>
      <swrl:argument1 rdf:resource="#x" />
    </swrl:ClassAtom>
    <swrl:IndividualPropertyAtom>
      <swrl:propertyPredicate rdf:resource="&dbs;hasColumn"/>
      <swrl:argument1 rdf:resource="#x" />
      <swrl:argument2 rdf:resource="#y" />
    </swrl:IndividualPropertyAtom>
  </swrl:body>
  <swrl:head rdf:parseType="Collection">
    <swrl:ClassAtom>
      <swrl:classPredicate rdf:resource="&owl;Class"/>
      <swrl:argument1 rdf:resource="#x" />
    </swrl:ClassAtom>
  </swrl:head>
</swrl:Imp>
Stanbol Rule Syntax
In Stanbol a rule is defined as
ruleName[body -> head]
where:
  ‣ The ruleName identifies the rule
  ‣ The body is a set of atoms that must be satisfied when evaluating the rule
  ‣ The head or consequent is a set of atoms that must be true if the condition is evaluated to be true
  ‣ Both body and head consist of a list of conjunctive atoms
    • body = atom1 . atom2 . … . atomN
    • head = atom1 . atom2 . … . atomM
  ‣ The conjunction ∧ in Stanbol Rules is expressed with the symbol “ . ”
Sample rule
Considering Stanbol Rules, the FOL formula
hasFather(x,y) ∧ hasBrother(y,z) ⇒ hasUncle(x,z)
becomes
myRule[ has(<http://myont.org/hasFather>, ?x, ?y) .
        has(<http://myont.org/hasBrother>, ?y, ?z)
        ->
        has(<http://myont.org/hasUncle>, ?x, ?z) ]
Namespace Prefixes
• URIs are useful, but sometimes too long for humans
• We can use namespace prefixes instead of full URIs in rule atoms
• e.g.:
myont = <http://myont.org/> .
myRule[ has(myont:hasFather, ?x, ?y) .
        has(myont:hasBrother, ?y, ?z)
        ->
        has(myont:hasUncle, ?x, ?z) ]
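To illustrate what evaluating such a rule involves, the following Python sketch applies a conjunctive body to a set of facts and instantiates the head for every consistent variable binding. It is a toy forward-chaining step, not the Stanbol rule engine; each `has(P, x, y)` atom is written as a `(P, x, y)` tuple, and the facts about `ann`, `bob` and `carl` are invented for the example.

```python
def unify(atom, fact, bindings):
    """Match one atom against one fact; return extended bindings or None."""
    extended = dict(bindings)
    for term, value in zip(atom, fact):
        if term.startswith("?"):                       # variable
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:                            # constant
            return None
    return extended

def apply_rule(body, head, facts):
    """Facts derived by the rule body -> head (atoms in the body are conjoined)."""
    solutions = [{}]
    for atom in body:
        solutions = [ext for b in solutions for fact in facts
                     if (ext := unify(atom, fact, b)) is not None]
    return {tuple(b.get(term, term) for term in atom)
            for b in solutions for atom in head}

# myRule[ has(myont:hasFather, ?x, ?y) . has(myont:hasBrother, ?y, ?z)
#         -> has(myont:hasUncle, ?x, ?z) ]
body = [("myont:hasFather", "?x", "?y"), ("myont:hasBrother", "?y", "?z")]
head = [("myont:hasUncle", "?x", "?z")]

facts = {                                              # invented example facts
    ("myont:hasFather", "ann", "bob"),
    ("myont:hasBrother", "bob", "carl"),
}
derived = apply_rule(body, head, facts)
print(derived)
```

The shared variable `?y` is what ties the two body atoms together: only bindings where Ann's father is the same individual as Carl's brother produce a `hasUncle` fact.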
Define a refactoring recipe
We want to use the FOAF vocabulary instead of SKOS
nyt:65169961111056171853 rdf:type skos:Concept ;
    skos:prefLabel “Marley, Bob” .
Define a refactoring recipe
skos = <http://www.w3.org/2004/02/skos/core#> .
foaf = <http://xmlns.com/foaf/0.1/> .
conceptToPerson[ is(skos:Concept, ?x) ->
                 is(foaf:Person, ?x) ] .
labelRule[ values(skos:prefLabel, ?x, ?y) ->
           values(foaf:name, ?x, ?y) ]
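The effect of this recipe on the NYT graph can be sketched in Python as a CONSTRUCT-style rewrite: only the triples produced by the two rules appear in the output. The function `refactor` is a hand-written stand-in for the two rules, not generated from the rule syntax, and the `nyt:` prefix is left unexpanded because the slides do not give its namespace.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def refactor(triples):
    """Apply conceptToPerson and labelRule; return only the constructed triples."""
    constructed = []
    for s, p, o in triples:
        if p == RDF_TYPE and o == SKOS + "Concept":
            # conceptToPerson: is(skos:Concept, ?x) -> is(foaf:Person, ?x)
            constructed.append((s, RDF_TYPE, FOAF + "Person"))
        elif p == SKOS + "prefLabel":
            # labelRule: values(skos:prefLabel, ?x, ?y) -> values(foaf:name, ?x, ?y)
            constructed.append((s, FOAF + "name", o))
    return constructed

marley = "nyt:65169961111056171853"   # prefix kept as on the slide
source = [
    (marley, RDF_TYPE, SKOS + "Concept"),
    (marley, SKOS + "prefLabel", "Marley, Bob"),
]
refactored = refactor(source)
for triple in refactored:
    print(triple)
```

After refactoring, the NYT resource is described with the same vocabulary as the DBpedia one, so the two graphs can be aggregated and queried with a single FOAF-based query.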
Exercise
• Download Semion and launch it as follows:
  ‣ (Mac) java -jar -Xmx512m -XstartOnFirstThread /LocalPathname/it.cnr.istc.semion.tool-0.6-SNAPSHOT.one-jar.jar
  ‣ (Win) java -jar -Xmx512m \LocalPathname\it.cnr.istc.semion.tool-0.6-SNAPSHOT.one-jar.jar
• Connect to the indicated database and perform reengineering first, and alignment (refactoring) second
What about consuming data?
Linked Data and Ontologies
• Tools listing: http://www.w3.org/2001/sw/wiki/Tools
• APIs for handling ontologies
  ‣ Jena
  ‣ The OWL API
• Communicating with an OWL reasoner: OWLlink protocol
• Triple stores
  ‣ Usually provide SPARQL endpoints
  ‣ Data storage could be RDB
• Linked data browsers and query interfaces
• ...but end-users should not see the technology behind!