Linked Data Technology & Status
Dr. Myungjin Lee
Linked Data & Semantic Web Technology

The Semantic Web
[Figure: the Semantic Web technology stack, with each layer annotated]
• OWL: more vocabulary for describing properties and classes
• RDFS: a vocabulary for describing properties and classes of RDF-based resources
• RIF: to exchange rules between many "rules languages"
• SPARQL: a protocol and query language for semantic web data sources
• XML: an elemental syntax for content structure within documents
• RDF: a simple language for expressing data models, which refer to objects ("resources") and their relationships
• URI: a string of characters used to identify a name or a resource
Source: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/#(24)

What is Linked Data?
• Linked data describes a method of publishing structured data so that it can be interlinked and become more useful.
• "The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data."
  - A roadmap to the Semantic Web by Tim Berners-Lee
Source: http://www.w3.org/DesignIssues/LinkedData.html

Four Principles of Linked Data
1. Use URIs to identify things.
2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
3. Provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML.
4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
Source: http://www.w3.org/DesignIssues/LinkedData.html

5 Star Linked Data
★ Available on the web (whatever format) but with an open licence, to be Open Data
★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
★★★ As (2), plus a non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people's data to provide context
Source: http://www.w3.org/DesignIssues/LinkedData.html

The Basic Requirements for Linked Data
• RDFS: a vocabulary for describing properties and classes of RDF-based resources
• SPARQL: a protocol and query language for semantic web data sources
• XML: an elemental syntax for content structure within documents
• RDF: a simple language for expressing data models, which refer to objects ("resources") and their relationships
• URI: a string of characters used to identify a name or a resource

[Figure: web search for "namdeamun"]
Source: http://www.google.co.kr/search?q=namdeamun

URI, Thing, and Representation
[Diagram: how URIs, things, and representations relate]
• A person or machine looks up a URI.
• A URI identifies and names a Thing.
• A Representation represents the Thing.
• URIs refer and link to other URIs:
  – http://data.kdata.kr/resource/Namdaemun
  – http://data.kdata.kr/resource/Sungnyemun
  – http://dbpedia.org/resource/Namdaemun
• Representation (HTML excerpt):
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
      <head>
        <title>Namdaemun | kdata.kr</title>
        <link rel="alternate" type="application/rdf+xml" href="http://data.kdata.kr/data/Namdaemun" title="RDF" />
      </head>
      <body onLoad="init();">
        <div id="header">
          <div>
            <h1 id="title">Namdaemun</h1>
            <div id="homelink">
              &nbsp;at <a href="http://kdata.kr">kdata.kr</a>
Source: http://www.slideshare.net/lysander07/open-hpi-semweb02part1

URIs for Real-World Objects
• Be on the Web
  – Given only a URI, machines and people should be able to retrieve a description of the resource identified by the URI from the Web.
• Be unambiguous
  – There should be no confusion between identifiers for Web documents and identifiers for other resources.
Source: http://www.w3.org/TR/cooluris/
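The Namdaemun page shown earlier advertises its RDF description through a link element in its head, which is one way a client can keep the URI of the thing apart from the URIs of documents about it. A minimal sketch using only Python's standard library and the markup from that slide; the names AlternateLinkFinder and find_rdf_documents are made up for illustration and are not part of any kdata.kr tooling:

```python
from html.parser import HTMLParser

# Head section copied from the Namdaemun slide; the body is omitted.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Namdaemun | kdata.kr</title>
<link rel="alternate" type="application/rdf+xml"
      href="http://data.kdata.kr/data/Namdaemun" title="RDF" />
</head>
</html>"""


class AlternateLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> elements typed as RDF/XML."""

    def __init__(self):
        super().__init__()
        self.rdf_links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also arrive here via handle_startendtag.
        if tag != "link":
            return
        attributes = dict(attrs)
        if (attributes.get("rel") == "alternate"
                and attributes.get("type") == "application/rdf+xml"):
            self.rdf_links.append(attributes.get("href"))


def find_rdf_documents(html_text):
    """Return the RDF document URIs advertised by an HTML page."""
    finder = AlternateLinkFinder()
    finder.feed(html_text)
    return finder.rdf_links


thing_uri = "http://data.kdata.kr/resource/Namdaemun"  # names the gate itself
rdf_document_uris = find_rdf_documents(PAGE)           # name documents about it
print(thing_uri, rdf_document_uris)
```

The thing URI never appears among the document URIs, which is the separation the "Be unambiguous" rule asks for.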
URIs for Real-World Objects
    <URI-of-alice> a foaf:Person;
        foaf:name "Alice";
        foaf:mbox <mailto:alice@example.com>;
        foaf:homepage <http://www.example.com/people/alice> .
[Diagram: one resource, three URIs]
• Resource identifier (URI): the ID of the resource
• RDF document URI: RDF, for semantic web applications
• HTML document URI: HTML, for web browsers
Source: http://www.w3.org/TR/cooluris/

Distinguishing between Representations and Descriptions
[Diagram: 303 redirect followed by content negotiation]
• Thing: http://data.kdata.kr/resource/Namdaemun
• 303 redirect to the generic document: http://data.kdata.kr/page/Namdaemun
• Content negotiation on the generic document:
  – application/rdf+xml returns RDF: http://data.kdata.kr/page/Namdaemun.rdf
  – text/html returns HTML: http://data.kdata.kr/page/Namdaemun.html
Source: http://www.w3.org/TR/cooluris/

Cool URIs
• Simplicity
  – short and mnemonic
• Stability
  – remain unchanged for as long as possible
• Manageability
  – issue your URIs in a way that you can manage
Source: http://www.w3.org/TR/cooluris/

Designing URI Sets for the UK Public Sector
• URIs:
  – name the set and describe its characteristics
  – identify the real-world 'Things' within a single concept
  – provide a means of looking up data on the web
  – provide mechanisms to:
    • look up an Identifier URI and be redirected to its Document URI
    • discover and get each of the Representation URIs

  URI Type     URI structure                               Example
  Identifier   http://{domain}/id/{concept}/{reference}    http://education.data.gov.uk/id/school/78

Sources: https://www.gov.uk/government/publications/designing-uri-sets-for-the-uk-public-sector
         http://data.gov.uk/resources/uris

URI Design Principles: Creating Unique URIs for Government Linked Data
• URI Template: 'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+
• States and Territories
  – Owner
    • federal
  – Suggested
    • http://BASE/id/us/state/NAME
  – Example
    • http://logd.tw.rpi.edu/id/us/state/Vermont
Source: http://logd.tw.rpi.edu/instance-hub-uri-design

XML (Extensible Markup Language)
• a textual data format for the representation of arbitrary data structures over the Internet
• both human-readable and machine-readable
    <title> W3C Demonstrates … </title>
    <date> 12 February 2013 </date>
    <body> W3C invites media, analysts, and other attendees of Mobile World Congress … </body>

  Concept        Related Recommendations
  Content        XML
  Structure      DTD, XML Schema
  Presentation   XSLT, XSL-FO, XPath

[Figure: the same title, date, and body content shown under two different presentations]
Source: http://en.wikipedia.org/wiki/Xml

Data Representation of XML
• Various ways to represent data using XML
  – Myungjin Lee is Hye-jin's husband.
    <conjugalrelation>
      <husband>Myungjin Lee</husband>
      <wife>Hye-jin Han</wife>
    </conjugalrelation>

    <conjugalrelation husband="Myungjin Lee">
      <wife>Hye-jin Han</wife>
    </conjugalrelation>

    <conjugalrelation husband="Myungjin Lee" wife="Hye-jin Han" />
• We need a method to represent data at an abstract level.

RDF (Resource Description Framework)
• a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax formats
  – Myungjin Lee is Hye-jin's husband.
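The three XML encodings of this one fact differ only in surface structure, which is the gap an abstract triple model is meant to close. As a rough sketch (standard-library Python; the predicate name hasWife is borrowed from the slides, while field and to_triple are helper names invented here), each variant can be reduced to the same subject-predicate-object tuple:

```python
import xml.etree.ElementTree as ET

# The three encodings from the "Data Representation of XML" slide,
# written with straight quotes so that they are well-formed XML.
VARIANTS = [
    "<conjugalrelation><husband>Myungjin Lee</husband>"
    "<wife>Hye-jin Han</wife></conjugalrelation>",
    '<conjugalrelation husband="Myungjin Lee">'
    "<wife>Hye-jin Han</wife></conjugalrelation>",
    '<conjugalrelation husband="Myungjin Lee" wife="Hye-jin Han" />',
]


def field(element, name):
    """Read a value whether it was encoded as an attribute or a child element."""
    if name in element.attrib:
        return element.attrib[name]
    child = element.find(name)
    return child.text if child is not None else None


def to_triple(xml_text):
    """Reduce one XML encoding to a (subject, predicate, object) tuple."""
    root = ET.fromstring(xml_text)
    return (field(root, "husband"), "hasWife", field(root, "wife"))


# A set collapses identical triples, so one element means all variants agree.
triples = {to_triple(variant) for variant in VARIANTS}
print(triples)
```

The tuple here uses plain strings; RDF goes one step further and names the subject, predicate, and object with URIs.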
[Diagram: Myungjin Lee --hasWife--> Hye-jin Han]
Source: http://en.wikipedia.org/wiki/Resource_Description_Framework

Data Representation of RDF
[Diagram: the hasWife statement as a triple]

  Role        Value                             Allowed form
  Subject     http://semantics.kr/myungjinlee   URI reference
  Predicate   http://semantics.kr/rel/hasWife   URI reference
  Object      http://semantics.kr/hye-jinhan    URI reference or literal

RDF Example
[Diagram: an RDF graph whose subject is http://www.cars.com/car#A6]

  Predicate                                          Object
  http://www.w3.org/1999/02/22-rdf-syntax-ns#type    http://www.cars.com/car#Car
  http://www.cars.com/car#fuel                       http://www.cars.com/car#Gasoline
  http://www.cars.com/car#drivetrain                 http://www.cars.com/car#AWD
  http://www.cars.com/car#engine                     http://www.cars.com/car#GDI
  http://www.cars.com/car#doors                      4
  http://www.cars.com/car#wheelbase                  115"
  http://www.cars.com/car#body_style                 http://www.cars.com/car#Sedan
  http://www.cars.com/car#transmission               http://www.cars.com/car#Auto_8-Speed

RDF Serialization
• N-Triples
  – RDF Test Cases, W3C Recommendation, 10 February 2004
  – a line-based, plain text serialization format for storing and transmitting RDF data
• Notation 3 (N3)
  – a shorthand non-XML serialization of RDF models, designed with human readability in mind
  – much more compact and readable than XML RDF notation
• Turtle (Terse RDF Triple Language)
  – W3C Candidate Recommendation, 19 February 2013
  – a format for expressing data in the Resource Description Framework (RDF) data model
  – a subset of the Notation3 (N3) language, and a superset of the minimal N-Triples format
• RDF/XML
  – W3C Recommendation, 10 February 2004
  – an XML syntax for writing down and exchanging RDF graphs
Sources: http://en.wikipedia.org/wiki/N-Triples
         http://en.wikipedia.org/wiki/Notation3
         http://en.wikipedia.org/wiki/Turtle_(syntax)

N-Triples
    <http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
    <http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .
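Because N-Triples puts exactly one triple on each line, the two statements above can be read with very little machinery. The following is a deliberately small sketch in standard-library Python, not a conforming parser: it assumes objects are either IRIs or plain quoted literals and ignores blank nodes, language tags, datatypes, and escape sequences. The function name parse_ntriples is invented for this example.

```python
import re

# One triple per line: <subject> <predicate> (<object IRI> | "plain literal") .
NT_LINE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.$')

DOCUMENT = """\
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .
"""


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) string tuples."""
    triples = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = NT_LINE.match(line)
        if match is None:
            raise ValueError("unsupported N-Triples line: " + line)
        subject, predicate, object_iri, object_literal = match.groups()
        obj = object_iri if object_iri is not None else object_literal
        triples.append((subject, predicate, obj))
    return triples


for subject, predicate, obj in parse_ntriples(DOCUMENT):
    print(subject, predicate, obj)
```

For real data an RDF library would be the safer choice; the point here is only that each line maps directly to one subject, predicate, and object.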
N3
    @prefix dc: <http://purl.org/dc/elements/1.1/>.
    <http://en.wikipedia.org/wiki/Tony_Benn>
        dc:title "Tony Benn";
        dc:publisher "Wikipedia".

RDF/XML
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
        <dc:title>Tony Benn</dc:title>
        <dc:publisher>Wikipedia</dc:publisher>
      </rdf:Description>
    </rdf:RDF>

Turtle
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix ex: <http://example.org/stuff/1.0/> .
    <http://www.w3.org/TR/rdf-syntax-grammar>
        dc:title "RDF/XML Syntax Specification (Revised)" ;
        ex:editor [
            ex:fullname "Dave Beckett";
            ex:homePage <http://purl.org/net/dajobe/>
        ] .

RDF 1.0 vs RDF 1.1

  Feature                           RDF 1.0   RDF 1.1
  Resource identification           URI       IRI (Internationalized Resource Identifier)
  Multiple RDF graphs               No        Yes
  HTML content as a literal value   No        rdf:HTML

Source: http://www.w3.org/TR/rdf11-concepts/

Recommendations of RDF
[Figure: list of RDF-related W3C documents]
Source: http://www.w3.org/standards/techs/rdf#w3c_all

RDF Schema
• W3C Recommendation, 10 February 2004
• to define classes and properties that may be used to describe classes, properties and other resources
• RDF Schema allows
  – Definition of Classes
  – Definition of Properties and Restrictions
  – Definition of Hierarchies
Source: http://www.slideshare.net/lysander07/openhpi-22

RDF Schema Example
[Diagram: TBox (terminological component) with the classes car:Vehicle, car:Car and car:Style and the property car:body_style, connected to rdfs:Class and rdf:Property by rdf:type, rdfs:subClassOf, rdfs:domain and rdfs:range edges; ABox (assertion component) with car:A6 and car:Sedan, connected by car:body_style and rdf:type]

RDF Semantics
• to give RDF's abstract syntax a formal meaning based on a model-theoretic semantics
• <x, y> is in IEXT(I(rdfs:subClassOf)) if and only if x and y are in IC and ICEXT(x) is a subset of ICEXT(y)
[Diagram: car:A6 rdf:type car:Car, and car:Car rdfs:subClassOf car:Vehicle, so car:A6 rdf:type car:Vehicle]

SPARQL
• Why do we need a query language for RDF?
  – Why do we need a query language for an RDB?
  – to get at the knowledge in RDF
• SPARQL Protocol and RDF Query Language
  – to retrieve and manipulate data stored in Resource Description Framework format
  – to use SPARQL via HTTP
Source: http://www.slideshare.net/lysander07/openhpi-semweb03part1

SPARQL Example
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?email
    WHERE {
      ?person a foaf:Person.
      ?person foaf:name ?name.
      ?person foaf:mbox ?email.
    }

Result of running the query against the RDF knowledge base:

  ?name           ?email
  Myungjin Lee    mjlee@li-st.com
  Gildong Hong    gildong@daum.net
  Grace Byun      grace@naver.com

SPARQL Query Forms
• SELECT query
  – Used to extract raw values from a SPARQL endpoint; the results are returned in a table format.
• CONSTRUCT query
  – Used to extract information from the SPARQL endpoint and transform the results into valid RDF.
• ASK query
  – Used to provide a simple True/False result for a query on a SPARQL endpoint.
• DESCRIBE query
  – Used to extract an RDF graph from the SPARQL endpoint; the contents are left to the endpoint to decide, based on what the maintainer deems useful information.
Source: http://en.wikipedia.org/wiki/SPARQL

OWL (Web Ontology Language)
• a family of knowledge representation languages for authoring ontologies
• Use OWL if you need more expressiveness
  – such as Man ∩ Woman = Ø
[Diagram: further examples of OWL expressiveness: a descendant property chained across Person nodes, a 1:1 relation between Husband and Wife, and an ActionMovie class related to the Genre value Action through subClassOf, type and hasGenre]

What more do we need?
[Diagram: a Linked Data architecture. SPARQL and the Linked Data Platform sit on top of a Linked Data Service; R2RML maps the RDBMS, and RDFa and GRDDL map HTML pages, into RDF knowledge held in a Triple Store]

R2RML
• RDB to RDF Mapping Language
• W3C Recommendation, 27 September 2012
• a language for expressing customized mappings from relational databases to RDF datasets

R2RML mapping (applied to the RDB table EMP):
    @prefix rr: <http://www.w3.org/ns/r2rml#>.
    @prefix ex: <http://example.com/ns#>.
    <#TriplesMap1>
        rr:logicalTable [ rr:tableName "EMP" ];
        rr:subjectMap [
            rr:template "http://data.example.com/employee/{EMPNO}";
            rr:class ex:Employee;
        ];
        rr:predicateObjectMap [
            rr:predicate ex:name;
            rr:objectMap [ rr:column "ENAME" ];
        ].

Result:
    <http://data.example.com/employee/7369> rdf:type ex:Employee.
    <http://data.example.com/employee/7369> ex:name "SMITH".
Source: http://www.w3.org/TR/r2rml/

Linked Data Platform
• a set of best practices and a simple approach for a read-write Linked Data architecture, based on HTTP access to web resources that describe their state using RDF
• W3C Working Draft, 25 October 2012
Source: http://www.w3.org/TR/ldp/

RDFa (the Resource Description Framework in attributes)
• W3C Recommendation, 07 June 2012
• to express machine-readable data in Web documents like HTML, SVG, and XML

Example:
    <p vocab="http://schema.org/" resource="#manu" typeof="Person">
      My name is <span property="name">Manu Sporny</span>
      and you can give me a ring via <span property="telephone">1-800-555-0199</span>.
      <img property="image" src="http://manu.sporny.org/images/manu.png" />
    </p>
Source: http://www.w3.org/TR/xhtml-rdfa-primer/

GRDDL (Gleaning Resource Descriptions from Dialects of Languages)
• a mechanism and markup format for obtaining RDF triples from XML documents, including XHTML

HTML:
    <html xmlns:grddl='http://www.w3.org/2003/g/data-view#'
          grddl:transformation="glean_title.xsl getAuthor.xsl">
      <head>
        <title>Are You Experienced?</title>
      </head>

glean_title.xsl:
    ...
    <xsl:stylesheet version="1.0">
      <xsl:template match="/">
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
          <rdf:Description rdf:about="{$subject}">
            <dc:title>
              <xsl:value-of select="/html:html/html:head/html:title"/>
            </dc:title>
          </rdf:Description>
        </rdf:RDF>
      </xsl:template>
    </xsl:stylesheet>

RDF:
    <rdf:RDF>
      <rdf:Description rdf:about="">
        <dc:title>Are You Experienced?</dc:title>
      </rdf:Description>
    </rdf:RDF>
Source: http://www.w3.org/TR/grddl/

Jena Platform
[Diagram: Jena components in the Linked Data architecture: Fuseki and ARQ & LARQ for SPARQL, the Jena API for the Linked Data Service, and TDB & SDB for the Triple Store and RDBMS, serving HTML pages]
Source: http://jena.apache.org/

Openlink Virtuoso
• a middleware and database engine hybrid that combines the functionality of a traditional RDBMS, ORDBMS, RDF, XML, etc.
  – Relational Data Management
  – RDF Data Management
  – XML Data Management
  – Free Text Content Management & Full Text Indexing
  – Document Web Server
  – Linked Data Server
  – Web Application Server
  – Web Services Deployment (SOAP or REST)
Source: http://virtuoso.openlinksw.com/

Openlink Virtuoso Coverage
[Diagram: Virtuoso components in the Linked Data architecture: SPARQL Server for SPARQL, the Linked Data Service, Sponger for HTML pages, and Storage and Inference for the RDBMS and Triple Store]

The Linking Open Data cloud diagram
[Figure: the LOD cloud, with regions labelled Media, Geographic, Publications, User Generated Content, Government, Cross-Domain, and Life Sciences]
Source: http://lod-cloud.net/

  Domain                    Number of datasets   Triples           (Out-)Links
  Media                     25                   1,841,852,061     50,440,705
  Geographic                31                   6,145,532,484     35,812,328
  Government                49                   13,315,009,400    19,343,519
  Publications              87                   2,950,720,693     139,925,218
  Cross-domain              41                   4,184,635,715     63,183,065
  Life Sciences             41                   3,036,336,004     191,844,090
  User-generated Content    20                   134,127,413       3,449,143
  Total                     295                  31,634,213,770    503,998,829

Source: http://www.slideshare.net/lysander07/13-semantic-web-technologies-linked-data-semantic-search

KDATA (Linked Data for Korea)

  Domain                                                   Triples
  Country codes                                            3,899
  Entertainment                                            44,278
  Administrative districts                                 2,969
  Elementary, middle and high schools                      126,469
  Offices of education                                     1,130
  Universities                                             2,833
  Social enterprises                                       5,539
  Open public restrooms in Seoul                           47,340
  Baseball players and teams                               228,872
  Subway stations                                          4,450
  History                                                  5,392
  Administrative data standard terms                       109,101
  Hanok villages                                           1,155
  Public Wi-Fi installation information                    1,671
  KDATA classification terms                               808
  Traditional markets                                      4,535
  National parks                                           10,605
  Cultural heritage                                        80,156
  Public sports facilities                                 49,799
  Biological taxonomy                                      3,256
  Cultural facilities                                      9,418
  Park information and programs                            2,429
  Price-stabilization model businesses                     16,212
  Product lists of price-stabilization model businesses    14,300
  Certified products for public facilities                 6,931
  Snow-removal box locations                               39,218
  Wildlife information                                     115,099
  Wildlife sighting information                            139,608
  Total                                                    1,077,472

Source: http://kdata.kr/index.jsp

SPARQL:
    select ?s
    where {
      ?s rdf:type <http://data.kdata.kr/class/NationalTreasure> .
      ?s rdfs:label "남대문" .
    }

HTML:
    http://data.kdata.kr/resource/Namdaemun

RDF:
    <rdf:RDF>
      <rdf:Description rdf:about="http://data.kdata.kr/data/Namdaemun?output=rdfxml">
        <rdfs:label>RDF description of Namdaemun</rdfs:label>
        <foaf:primaryTopic>
          <kdc:StateDesignatedHeritage rdf:about="http://data.kdata.kr/resource/Namdaemun">
            <rdfs:label>남대문</rdfs:label>
            <rdfs:label>숭례문</rdfs:label>
            <foaf:depiction rdf:resource="20060227132556895000.jpg"/>
            <owl:sameAs rdf:resource="http://dbpedia.org/resource/Namdaemun"/>
            ...
    </rdf:RDF>

Contents Search on the Semantic Web

Dr. Myungjin Lee
e-Mail : mjlee@li-st.com
Twitter : http://twitter.com/MyungjinLee
Facebook : http://www.facebook.com/mjinlee
SlideShare : http://www.slideshare.net/onlyjiny/