RDF - Shut Up and Do Linked Data

Linked Data
Technology & Status
Dr. Myungjin Lee
Linked Data & Semantic Web Technology
The Semantic Web
• OWL: more vocabulary for describing properties and classes
• RDFS: a vocabulary for describing properties and classes of RDF-based resources
• RIF: to exchange rules between many "rules languages"
• SPARQL: a protocol and query language for semantic web data sources
• XML: an elemental syntax for content structure within documents
• RDF: a simple language for expressing data models, which refer to objects ("resources") and their relationships
• URI: a string of characters used to identify a name or a resource
http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/#(24)
What is Linked Data?
Linked data describes a method of publishing structured data so that it can be interlinked and become more useful.
"The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data."
- Tim Berners-Lee, Linked Data (Design Issues)
http://www.w3.org/DesignIssues/LinkedData.html
Four Principles of Linked Data
1. Use URIs to identify things.
2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
3. Provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML.
4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
http://www.w3.org/DesignIssues/LinkedData.html
5 Star Linked Data
★ Available on the web (whatever format) but with an open licence, to be Open Data
★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
★★★ As (2), plus a non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people’s data to provide context
http://www.w3.org/DesignIssues/LinkedData.html
The Basic Requirements for Linked Data
• RDFS: a vocabulary for describing properties and classes of RDF-based resources
• SPARQL: a protocol and query language for semantic web data sources
• XML: an elemental syntax for content structure within documents
• RDF: a simple language for expressing data models, which refer to objects ("resources") and their relationships
• URI: a string of characters used to identify a name or a resource
http://www.google.co.kr/search?q=namdeamun
URI, Thing, and Representation
• A person or machine looks up a URI, e.g. http://data.kdata.kr/resource/Namdaemun
• The URI identifies and names a Thing
• A URI refers and links to other URIs, e.g. http://data.kdata.kr/resource/Sungnyemun and http://dbpedia.org/resource/Namdaemun
• A Representation represents the Thing:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Namdaemun | kdata.kr</title>
<link rel="alternate" type="application/rdf+xml" href="http://data.kdata.kr/data/Namdaemun" title="RDF" />
</head>
<body onLoad="init();">
<div id="header">
<div>
<h1 id="title">Namdaemun</h1>
<div id="homelink">  at <a href="http://kdata.kr">kdata.kr</a>
http://www.slideshare.net/lysander07/open-hpi-semweb02part1
http://www.w3.org/TR/cooluris/
URIs for Real-World Objects
• Be on the Web
– Given only a URI, machines and people should be able to retrieve a description about the resource identified by the URI from the Web.
• Be unambiguous
– There should be no confusion between identifiers for Web documents and identifiers for other resources.
http://www.w3.org/TR/cooluris/
URIs for Real-World Objects
<URI-of-alice> a foaf:Person;
foaf:name "Alice";
foaf:mbox <mailto:alice@example.com>;
foaf:homepage <http://www.example.com/people/alice> .
• Resource identifier (URI): the ID of the thing itself
• RDF document URI: RDF for semantic web applications
• HTML document URI: HTML for web browsers
http://www.w3.org/TR/cooluris/
Distinguishing between Representations and Descriptions
• Thing: http://data.kdata.kr/resource/Namdaemun
• 303 redirect to the Generic Document: http://data.kdata.kr/page/Namdaemun
• Content negotiation on the Generic Document:
– application/rdf+xml: RDF at http://data.kdata.kr/page/Namdaemun.rdf
– text/html: HTML at http://data.kdata.kr/page/Namdaemun.html
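The two hops described here (a 303 redirect from the Thing URI, then content negotiation on the Generic Document) can be sketched in Python. This is only a simulation under stated assumptions: the function names `parse_accept` and `dereference` are hypothetical, no real HTTP request is made, and the Accept-header parsing is deliberately simplified.

```python
# Hypothetical helper names; the URIs are the ones shown on the slide.
THING_URI = "http://data.kdata.kr/resource/Namdaemun"
GENERIC_DOCUMENT_URI = "http://data.kdata.kr/page/Namdaemun"

# Media type -> suffix of the concrete document.
VARIANTS = {
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q value."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                q = float(field[2:])
        if media_type:
            prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda pair: -pair[1])


def dereference(uri, accept):
    """Simulate the two hops: 303 redirect, then content negotiation."""
    if uri == THING_URI:
        return 303, GENERIC_DOCUMENT_URI
    if uri == GENERIC_DOCUMENT_URI:
        for media_type, q in parse_accept(accept):
            if q > 0 and media_type in VARIANTS:
                return 200, uri + VARIANTS[media_type]
        return 406, None
    return 404, None
```

A semantic web client would send `Accept: application/rdf+xml` and end up at the `.rdf` document, while a browser sending `text/html` would end up at the `.html` document.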
Cool URIs
• Simplicity
– short and mnemonic
• Stability
– remain as long as possible
• Manageability
– issue your URIs in a way that you can manage
http://www.w3.org/TR/cooluris/
Designing URI Sets for the UK Public Sector
• URIs:
– name the set and describe its characteristics
– identify the real-world ‘Things’ in a single concept
– provide a means of looking up data on the web
– provide mechanisms to:
• look up an Identifier URI and be redirected to its Document URI
• discover and get each of the Representation URIs
URI Type | URI structure | Example
Identifier | http://{domain}/id/{concept}/{reference} | http://education.data.gov.uk/id/school/78
https://www.gov.uk/government/publications/designing-uri-sets-for-the-uk-public-sector
http://data.gov.uk/resources/uris
URI Design Principles:
Creating Unique URIs for Government Linked Data
• URI Template:
'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+
• States and Territories
– Owner
• federal
– Suggested
• http://BASE/id/us/state/NAME
– Example
• http://logd.tw.rpi.edu/id/us/state/Vermont
http://logd.tw.rpi.edu/instance-hub-uri-design
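Both URI patterns (the UK `http://{domain}/id/{concept}/{reference}` structure and the US `'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+` template) amount to filling a fixed path with percent-encoded segments. A minimal Python sketch follows; the function names are hypothetical and not part of either guideline.

```python
from urllib.parse import quote


def uk_identifier_uri(domain, concept, reference):
    """Fill http://{domain}/id/{concept}/{reference}."""
    return "http://{}/id/{}/{}".format(
        domain, quote(concept, safe=""), quote(str(reference), safe="")
    )


def us_identifier_uri(base, org, category, *tokens):
    """Fill 'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+."""
    if not tokens:
        raise ValueError("the template requires at least one TOKEN")
    segments = [org, category, *tokens]
    path = "/".join(quote(segment, safe="") for segment in segments)
    return "http://{}/id/{}".format(base, path)


school = uk_identifier_uri("education.data.gov.uk", "school", 78)
vermont = us_identifier_uri("logd.tw.rpi.edu", "us", "state", "Vermont")
```

With the slide's own values, `school` and `vermont` reproduce the two example URIs shown above.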
XML (Extensible Markup Language)
• a textual data format for the representation of arbitrary data structures over the Internet
• both human-readable and machine-readable
<title>
W3C Demonstrates …
</title>
<date>
12 February 2013
</date>
<body>
W3C invites media,
analysts, and other attendees
of Mobile World Congress
…
</body>
Concept: Content, Structure, Presentation
Related Recommendations: XML, DTD, XML Schema, XSLT, XSL-FO, XPath
http://en.wikipedia.org/wiki/Xml
Data Representation of XML
• Various ways to represent data using XML
– Myungjin Lee is Hye-jin’s husband.
<conjugalrelation>
<husband>Myungjin Lee</husband>
<wife>Hye-jin Han</wife>
</conjugalrelation>
<conjugalrelation husband="Myungjin Lee">
<wife>Hye-jin Han</wife>
</conjugalrelation>
<conjugalrelation husband="Myungjin Lee" wife="Hye-jin Han" />
• We need a method to represent data at an abstract level.
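The point of the three XML variants is that one fact has several surface forms, so a consumer has to know each layout in advance. The Python sketch below reads all three and reduces them to a single abstract statement. The helper names are hypothetical, and the relation name `hasWife` is borrowed from the RDF slides that follow.

```python
import xml.etree.ElementTree as ET

# The three encodings from the slide, with straight quotes.
VARIANTS = [
    "<conjugalrelation><husband>Myungjin Lee</husband>"
    "<wife>Hye-jin Han</wife></conjugalrelation>",
    '<conjugalrelation husband="Myungjin Lee">'
    "<wife>Hye-jin Han</wife></conjugalrelation>",
    '<conjugalrelation husband="Myungjin Lee" wife="Hye-jin Han" />',
]


def read_role(element, role):
    """Look for the role as an attribute first, then as a child element."""
    if role in element.attrib:
        return element.attrib[role]
    child = element.find(role)
    return child.text if child is not None else None


def to_fact(xml_text):
    """Reduce one XML encoding to an abstract (subject, relation, object) fact."""
    root = ET.fromstring(xml_text)
    return (read_role(root, "husband"), "hasWife", read_role(root, "wife"))


facts = {to_fact(variant) for variant in VARIANTS}
```

All three documents collapse into the same fact, which is the abstract level that the XML syntax alone does not provide.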
RDF (Resource Description Framework)
• a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax formats
– Myungjin Lee is Hye-jin’s husband.
Myungjin Lee → hasWife → Hye-jin Han
http://en.wikipedia.org/wiki/Resource_Description_Framework
Data Representation of RDF
Triple:
Subject | Predicate | Object
http://semantics.kr/myungjinlee | http://semantics.kr/rel/hasWife | http://semantics.kr/hye-jinhan
URI reference | URI reference | URI reference or Literal
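As a rough illustration of the subject/predicate/object model, the triple can be held in a small Python structure and written out as one N-Triples line. This is a simplified sketch: `Triple`, `Literal`, and `to_ntriples` are hypothetical names, and only backslashes and double quotes are escaped in literals.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])


class Literal(str):
    """Marks an object as a plain literal rather than a URI reference."""


def to_ntriples(triple):
    """Serialize one triple as a single N-Triples line (simplified escaping)."""
    def term(value):
        if isinstance(value, Literal):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            return '"{}"'.format(escaped)
        return "<{}>".format(value)

    return "{} {} {} .".format(
        term(triple.subject), term(triple.predicate), term(triple.object)
    )


marriage = Triple(
    "http://semantics.kr/myungjinlee",
    "http://semantics.kr/rel/hasWife",
    "http://semantics.kr/hye-jinhan",
)
line = to_ntriples(marriage)
```

Subject and predicate are always URI references. Only the object position may hold a `Literal`, which matches the last row of the table.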
RDF Example
Subject: http://www.cars.com/car#A6
Predicate | Object
http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://www.cars.com/car#Car
http://www.cars.com/car#body_style | http://www.cars.com/car#Sedan
http://www.cars.com/car#engine | http://www.cars.com/car#GDI
http://www.cars.com/car#fuel | http://www.cars.com/car#Gasoline
http://www.cars.com/car#transmission | http://www.cars.com/car#Auto_8-Speed
http://www.cars.com/car#drivetrain | http://www.cars.com/car#AWD
http://www.cars.com/car#doors | 4
http://www.cars.com/car#wheelbase | 115"
RDF Serialization
• N-Triples
– RDF Test Cases, W3C Recommendation, 10 February 2004
– a line-based, plain text serialization format for storing and transmitting RDF data
• Notation 3 (N3)
– a shorthand non-XML serialization of RDF models, designed with human-readability in mind
– much more compact and readable than XML RDF notation
• Turtle (Terse RDF Triple Language)
– W3C Candidate Recommendation, 19 February 2013
– a format for expressing data in the Resource Description Framework (RDF) data model
– a subset of Notation3 (N3) language, and a superset of the minimal N-Triples format
• RDF/XML
– W3C Recommendation, 10 February 2004
– an XML syntax for writing down and exchanging RDF graphs
http://en.wikipedia.org/wiki/N-Triples
http://en.wikipedia.org/wiki/Notation3
http://en.wikipedia.org/wiki/Turtle_(syntax)
N-Triples
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .
N3
@prefix dc: <http://purl.org/dc/elements/1.1/>.
<http://en.wikipedia.org/wiki/Tony_Benn>
dc:title "Tony Benn";
dc:publisher "Wikipedia".
RDF/XML
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
<dc:title>Tony Benn</dc:title>
<dc:publisher>Wikipedia</dc:publisher>
</rdf:Description>
</rdf:RDF>
Turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/stuff/1.0/> .
<http://www.w3.org/TR/rdf-syntax-grammar>
dc:title "RDF/XML Syntax Specification (Revised)" ;
ex:editor [ ex:fullname "Dave Beckett";
ex:homePage <http://purl.org/net/dajobe/>
] .
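The serializations above carry the same triples and differ mainly in how URIs are written. As a rough Python sketch, the two N-Triples lines can be parsed with a regular expression, and a full URI can be shortened to the `prefix:local` form that N3 and Turtle use. The pattern is intentionally simplified (URI subjects and predicates, URI or plain-literal objects only), and the function names are hypothetical.

```python
import re

NTRIPLES = '''\
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .
'''

# Simplified pattern: URI subject, URI predicate, URI or plain-literal object.
LINE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+(?:<([^>]+)>|"([^"]*)")\s*\.$')


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) string tuples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError("unsupported line: " + line)
        subject, predicate, uri_obj, literal_obj = match.groups()
        obj = uri_obj if uri_obj is not None else literal_obj
        triples.append((subject, predicate, obj))
    return triples


def abbreviate(uri, prefixes):
    """Shorten a URI to prefix:local form, falling back to <uri>."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return "<" + uri + ">"


triples = parse_ntriples(NTRIPLES)
short = abbreviate(triples[1][1], {"dc": "http://purl.org/dc/elements/1.1/"})
```

Here `short` is the `dc:publisher` form seen in the N3 example, derived from the long predicate URI in the N-Triples example.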
http://www.w3.org/TR/rdf11-concepts/
RDF 1.0 vs RDF 1.1
Feature | RDF 1.0 | RDF 1.1
Resource Identification | URI | IRI (Internationalized Resource Identifier)
Multiple RDF Graphs | X | O
HTML content for literal value | X | rdf:HTML
Recommendations of RDF
http://www.w3.org/standards/techs/rdf#w3c_all
RDF Schema
• W3C Recommendation, 10 February 2004
• to define classes and properties that may be used to describe classes, properties and other resources
• RDF Schema allows
– Definition of Classes
– Definition of Properties and Restrictions
– Definition of Hierarchies
http://www.slideshare.net/lysander07/openhpi-22
RDF Schema Example
[Figure: RDF graph of the car vocabulary]
TBox - terminological component: car:Vehicle, car:Car, car:Style, car:body_style, rdfs:Class, rdf:Property, connected by rdf:type, rdfs:subClassOf, rdfs:domain, rdfs:range
ABox - assertion component: car:A6, car:Sedan, connected by rdf:type, car:body_style
RDF Semantics
• to give RDF's abstract syntax a formal meaning based on model-theoretic semantics
<x, y> is in IEXT(I(rdfs:subClassOf)) if and only if x and y are in IC and ICEXT(x) is a subset of ICEXT(y)
– car:Car rdfs:subClassOf car:Vehicle
– car:A6 rdf:type car:Car
– therefore: car:A6 rdf:type car:Vehicle
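The subset condition on class extensions is what lets a reasoner derive new rdf:type statements. A minimal Python sketch of that inference, assuming only two entailment patterns (type inheritance through rdfs:subClassOf, and transitivity of rdfs:subClassOf) and using prefixed names as plain strings:

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

graph = {
    ("car:Car", SUBCLASS_OF, "car:Vehicle"),
    ("car:A6", RDF_TYPE, "car:Car"),
}


def rdfs_closure(triples):
    """Apply the two subclass rules repeatedly until nothing new is derived."""
    closure = set(triples)
    while True:
        new = set()
        subclass_pairs = [(s, o) for s, p, o in closure if p == SUBCLASS_OF]
        for sub, sup in subclass_pairs:
            for s, p, o in closure:
                if p == RDF_TYPE and o == sub:
                    # x type C, C subClassOf D  =>  x type D
                    new.add((s, RDF_TYPE, sup))
                elif p == SUBCLASS_OF and o == sub:
                    # B subClassOf C, C subClassOf D  =>  B subClassOf D
                    new.add((s, SUBCLASS_OF, sup))
        if new <= closure:
            return closure
        closure |= new


inferred = rdfs_closure(graph) - graph
```

For the car example, the only derived statement is that car:A6 is also a car:Vehicle. This is a toy fixpoint loop, not a full RDFS reasoner.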
SPARQL
• Why do we need a query language for RDF?
– Why do we need a query language for an RDB?
– to get at the knowledge in RDF data
• SPARQL Protocol and RDF Query Language
– to retrieve and manipulate data stored in Resource Description Framework format
– to use SPARQL via HTTP
http://www.slideshare.net/lysander07/openhpi-semweb03part1
SPARQL Example
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
?person a foaf:Person.
?person foaf:name ?name.
?person foaf:mbox ?email.
}
Result from the RDF Knowledge Base:
?name | ?email
Myungjin Lee | mjlee@li-st.com
Gildong Hong | gildong@daum.net
Grace Byun | grace@naver.com
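How the WHERE clause produces those rows can be sketched as pattern matching over triples: each pattern either extends the variable bindings or rejects a candidate. The Python below is a toy matcher, not a SPARQL engine. The subject URIs under `http://example.org/person/` are hypothetical, and the names and addresses are the ones from the result table.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"  # 'a' in SPARQL

PEOPLE = [
    ("Myungjin Lee", "mjlee@li-st.com"),
    ("Gildong Hong", "gildong@daum.net"),
    ("Grace Byun", "grace@naver.com"),
]

triples = []
for index, (name, email) in enumerate(PEOPLE):
    person = "http://example.org/person/{}".format(index)  # hypothetical URI
    triples += [
        (person, RDF_TYPE, FOAF + "Person"),
        (person, FOAF + "name", name),
        (person, FOAF + "mbox", email),
    ]


def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def select(variables, patterns, data):
    """Join the patterns one by one and project the requested variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in data
            for extended in [match_pattern(pattern, triple, binding)]
            if extended is not None
        ]
    return [tuple(b[v] for v in variables) for b in bindings]


query = [
    ("?person", RDF_TYPE, FOAF + "Person"),
    ("?person", FOAF + "name", "?name"),
    ("?person", FOAF + "mbox", "?email"),
]
rows = select(["?name", "?email"], query, triples)
```

The shared `?person` variable is what joins the three patterns, so each name is paired only with the mailbox of the same resource.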
SPARQL Query Forms
• SELECT query
– Used to extract raw values from a SPARQL endpoint; the results are returned in a table format.
• CONSTRUCT query
– Used to extract information from the SPARQL endpoint and transform the results into valid RDF.
• ASK query
– Used to provide a simple True/False result for a query on a SPARQL endpoint.
• DESCRIBE query
– Used to extract an RDF graph from the SPARQL endpoint; the contents are left to the endpoint to decide, based on what the maintainer deems useful information.
http://en.wikipedia.org/wiki/SPARQL
OWL (Web Ontology Language)
• a family of knowledge representation languages for authoring ontologies
• If you need more expressiveness → OWL
– such as:
[Figure: examples that call for OWL: Man ∩ Woman = Ø; descendant relating Person to Person; a 1:1 relation between Husband and Wife; ActionMovie as a subClassOf defined via hasGenre Action, where Action has type Genre]
What more do we need?
[Figure: a Linked Data Service built around a Triple Store (RDF + Knowledge): R2RML maps RDBMS data, RDFa and GRDDL extract RDF from HTML, and SPARQL and the Linked Data Platform provide access]
http://www.w3.org/TR/r2rml/
R2RML
• RDB to RDF Mapping Language
• W3C Recommendation 27 September 2012
• a language for expressing customized mappings
from relational databases to RDF datasets
RDB
R2RML
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns#>.
<#TriplesMap1>
rr:logicalTable [ rr:tableName "EMP" ];
rr:subjectMap [
rr:template "http://data.example.com/employee/{EMPNO}";
rr:class ex:Employee;
];
rr:predicateObjectMap [
rr:predicate ex:name;
rr:objectMap [ rr:column "ENAME" ];
].
Result
<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".
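To make the mapping concrete, the Python sketch below mirrors the triples map as a dictionary and applies it to one row of the EMP table. It is not an R2RML processor. The dictionary layout and function names are hypothetical, and the row values are the ones implied by the Result above.

```python
import re

# Python mirror of the triples map shown above (layout is an assumption).
TRIPLES_MAP = {
    "table": "EMP",
    "subject_template": "http://data.example.com/employee/{EMPNO}",
    "class": "ex:Employee",
    "predicate_object_maps": [{"predicate": "ex:name", "column": "ENAME"}],
}


def expand_template(template, row):
    """Replace each {COLUMN} placeholder with the row's value."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)


def map_row(triples_map, row):
    """Generate the triples that one table row contributes."""
    subject = "<{}>".format(expand_template(triples_map["subject_template"], row))
    triples = [(subject, "rdf:type", triples_map["class"])]
    for pom in triples_map["predicate_object_maps"]:
        literal = '"{}"'.format(row[pom["column"]])
        triples.append((subject, pom["predicate"], literal))
    return triples


row = {"EMPNO": 7369, "ENAME": "SMITH"}
generated = map_row(TRIPLES_MAP, row)
```

Each row yields one rdf:type triple from `rr:class` plus one triple per predicate-object map, which matches the two Result lines.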
Linked Data Platform
• A set of best practices and a simple approach for a read-write Linked Data architecture, based on HTTP access to web resources that describe their state using RDF
• W3C Working Draft 25 October 2012
http://www.w3.org/TR/ldp/
RDFa (the Resource Description Framework in attributes)
• W3C Recommendation, 07 June 2012
• to express machine-readable data in Web
documents like HTML, SVG, and XML
Example
<p vocab="http://schema.org/" resource="#manu" typeof="Person">
My name is
<span property="name">Manu Sporny</span>
and you can give me a ring via
<span property="telephone">1-800-555-0199</span>.
<img property="image" src="http://manu.sporny.org/images/manu.png" />
</p>
http://www.w3.org/TR/xhtml-rdfa-primer/
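What "machine-readable" means for the RDFa example can be sketched with Python's standard HTML parser: the `property` attributes are enough to recover property/value pairs. This handles only the flat structure of this one snippet and is not an RDFa processor. The class name `PropertyCollector` is hypothetical.

```python
from html.parser import HTMLParser

SNIPPET = """
<p vocab="http://schema.org/" resource="#manu" typeof="Person">
My name is
<span property="name">Manu Sporny</span>
and you can give me a ring via
<span property="telephone">1-800-555-0199</span>.
<img property="image" src="http://manu.sporny.org/images/manu.png" />
</p>
"""


class PropertyCollector(HTMLParser):
    """Collects (property URI, value) pairs from flat RDFa markup."""

    def __init__(self):
        super().__init__()
        self.vocab = ""
        self.pairs = []
        self._open_property = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "vocab" in attrs:
            self.vocab = attrs["vocab"]
        prop = attrs.get("property")
        if prop is None:
            return
        if "src" in attrs:
            # The value is the linked resource.
            self.pairs.append((self.vocab + prop, attrs["src"]))
        else:
            # The value is the element's text content.
            self._open_property = prop

    def handle_data(self, data):
        if self._open_property is not None:
            self.pairs.append((self.vocab + self._open_property, data.strip()))
            self._open_property = None

    def handle_endtag(self, tag):
        self._open_property = None


collector = PropertyCollector()
collector.feed(SNIPPET)
collector.close()
pairs = collector.pairs
```

The `vocab` attribute supplies the namespace, so `name` becomes `http://schema.org/name`, and so on for the other two properties.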
http://www.w3.org/TR/grddl/
GRDDL (Gleaning Resource Descriptions from Dialects of Languages)
• a mechanism and markup format for obtaining RDF triples out of XML documents, including XHTML
HTML
<html xmlns:grddl='http://www.w3.org/2003/g/data-view#'
grddl:transformation="glean_title.xsl getAuthor.xsl">
<head>
<title>Are You Experienced?</title>
</head>
glean_title.xsl
...
<xsl:stylesheet version="1.0">
<xsl:template match="/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="{$subject}">
<dc:title>
<xsl:value-of select="/html:html/html:head/html:title"/>
</dc:title>
</rdf:Description>
</rdf:RDF>
</xsl:template>
</xsl:stylesheet>
RDF
<rdf:RDF>
<rdf:Description rdf:about="">
<dc:title>Are You Experienced?</dc:title>
</rdf:Description>
</rdf:RDF>
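GRDDL itself relies on XSLT, which Python's standard library does not provide, so the sketch below only imitates what glean_title.xsl does: it finds the XHTML title and emits it as a dc:title statement. The completed document (with the XHTML default namespace and a body) is an assumption, since the slide's HTML is truncated, and `glean_title` is a hypothetical function name.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Assumed complete form of the truncated HTML example above.
DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Are You Experienced?</title>
</head>
<body/>
</html>"""


def glean_title(xhtml_text, subject=""):
    """Return a (subject, predicate, object) triple for the title, or None."""
    root = ET.fromstring(xhtml_text)
    title = root.find("{0}head/{0}title".format(XHTML_NS))
    if title is None or title.text is None:
        return None
    return (subject, DC_TITLE, title.text.strip())


triple = glean_title(DOCUMENT)
```

The empty subject corresponds to `rdf:about=""` in the RDF output, meaning the document itself.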
Jena Platform
[Figure: Jena components mapped onto the Linked Data Service architecture: Fuseki (SPARQL), ARQ & LARQ, Jena API, and TDB & SDB (Triple Store), alongside RDBMS and HTML sources]
http://jena.apache.org/
Openlink Virtuoso
• a middleware and database engine hybrid that combines the functionality of a traditional RDBMS, ORDBMS, RDF, XML, etc.
– Relational Data Management
– RDF Data Management
– XML Data Management
– Free Text Content Management & Full Text Indexing
– Document Web Server
– Linked Data Server
– Web Application Server
– Web Services Deployment (SOAP or REST)
http://virtuoso.openlinksw.com/
Openlink Virtuoso Coverage
[Figure: Virtuoso components mapped onto the Linked Data Service architecture: SPARQL Server (SPARQL), Sponger, and Storage and Inference (Triple Store), alongside RDBMS and HTML sources]
http://lod-cloud.net/
The Linking Open Data cloud diagram
Domain | Number of datasets | Triples | (Out-)Links
Media | 25 | 1,841,852,061 | 50,440,705
Geographic | 31 | 6,145,532,484 | 35,812,328
Government | 49 | 13,315,009,400 | 19,343,519
Publications | 87 | 2,950,720,693 | 139,925,218
Cross-domain | 41 | 4,184,635,715 | 63,183,065
Life Sciences | 41 | 3,036,336,004 | 191,844,090
User-generated Content | 20 | 134,127,413 | 3,449,143
Total | 295 | 31,634,213,770 | 503,998,829
http://www.slideshare.net/lysander07/13-semantic-web-technologies-linked-data-semantic-search
KDATA (Linked Data for Korea)
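Domain | Triples
Country codes | 3,899
Entertainment | 44,278
Administrative districts | 2,969
Elementary, middle, and high schools | 126,469
Offices of education | 1,130
Universities | 2,833
Social enterprises | 5,539
Seoul public restrooms | 47,340
Baseball players and teams | 228,872
Subway stations | 4,450
History | 5,392
Administrative data standard terms | 109,101
Hanok villages | 1,155
Public WiFi installation information | 1,671
KDATA classification terms | 808
Traditional markets | 4,535
National parks | 10,605
Cultural heritage | 80,156
Public sports facilities | 49,799
Biological taxonomy | 3,256
Cultural facilities | 9,418
Park information and programs | 2,429
Price-stabilization model businesses | 16,212
Price-stabilization model business product lists | 14,300
Certified public facility products | 6,931
Snow-removal box locations | 39,218
Wildlife information | 115,099
Wildlife sighting information | 139,608
Total | 1,077,472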
http://kdata.kr/index.jsp
SPARQL
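PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s
WHERE {
?s rdf:type <http://data.kdata.kr/class/NationalTreasure> .
?s rdfs:label "남대문" .
}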
HTML
http://data.kdata.kr/resource/Namdaemun
<rdf:RDF>
<rdf:Description rdf:about="http://data.kdata.kr/data/Namdaemun?output=rdfxml">
<rdfs:label>RDF description of Namdaemun</rdfs:label>
<foaf:primaryTopic>
<kdc:StateDesignatedHeritage rdf:about="http://data.kdata.kr/resource/Namdaemun">
<rdfs:label>남대문</rdfs:label>
<rdfs:label>숭례문</rdfs:label>
<foaf:depiction rdf:resource="20060227132556895000.jpg"/>
<owl:sameAs rdf:resource="http://dbpedia.org/resource/Namdaemun"/>
...
</rdf:RDF>
RDF
Contents Search on the Semantic Web
Dr. Myungjin Lee
e-Mail : mjlee@li-st.com
Twitter : http://twitter.com/MyungjinLee
Facebook : http://www.facebook.com/mjinlee
SlideShare : http://www.slideshare.net/onlyjiny/